MULTILEVEL MONTE CARLO ALGORITHMS FOR LÉVY-DRIVEN
SDES WITH GAUSSIAN CORRECTION

By Steffen Dereich 

Philipps-Universität Marburg

We introduce and analyze multilevel Monte Carlo algorithms for
the computation of $\mathbb{E}f(Y)$, where $Y=(Y_t)_{t\in[0,1]}$ is the solution of
a multidimensional Lévy-driven stochastic differential equation and
$f$ is a real-valued function on the path space. The algorithm relies
on approximations obtained by simulating large jumps of the Lévy
process individually and applying a Gaussian approximation for the
small jump part. Upper bounds are provided for the worst case error
over the class of all measurable real functions $f$ that are Lipschitz
continuous with respect to the supremum norm. These upper bounds
are easily tractable once one knows the behavior of the Lévy measure
around zero.

In particular, one can derive upper bounds from the Blumenthal-Getoor
index of the Lévy process. In the case where the Blumenthal-Getoor
index is larger than one, this approach is superior to algorithms
that do not apply a Gaussian approximation. If the Lévy
process does not incorporate a Wiener process or if the Blumenthal-Getoor
index $\beta$ is larger than $\frac43$, then the upper bound is of order
$\tau^{-(4-\beta)/(6\beta)}$ when the runtime $\tau$ tends to infinity. In the
case where $\beta$ is in $[1,\frac43]$ and the Lévy process has a Gaussian
component, we obtain bounds of order $\tau^{-\beta/(6\beta-4)}$. In particular, the error
is at most of order $\tau^{-1/6}$.

1. Introduction. Let $d_Y\in\mathbb{N}$ and denote by $D[0,1]$ the Skorokhod space
of functions mapping $[0,1]$ to $\mathbb{R}^{d_Y}$, endowed with its Borel $\sigma$-field. In this
article, we analyze numerical schemes for the evaluation of

$$S(f):=\mathbb{E}[f(Y)],$$

where 
• $Y=(Y_t)_{t\in[0,1]}$ is a solution to a multivariate stochastic differential equation
driven by a multidimensional Lévy process (with state space $\mathbb{R}^{d_Y}$),
and

• $f:D[0,1]\to\mathbb{R}$ is a Borel measurable function that is Lipschitz continuous
with respect to the supremum norm.

This is a classical problem which appears, for instance, in finance, where $Y$
models the risk-neutral stock price and $f$ denotes the payoff of a (possibly
path-dependent) option. Several approaches have been employed in the past
to deal with it.

A common stochastic approach is to perform a Monte Carlo simulation of 
numerical approximations to the solution Y. Typically, the Euler or Milstein 
schemes are used to obtain approximations. Also higher order schemes can be 
applied provided that samples of iterated Ito integrals are supplied and the 
coefficients of the equation are sufficiently regular. In general, the problem 
is tightly related to weak approximation which is, for instance, extensively 
studied in the monograph by Kloeden and Platen [12] for diffusions. 

Essentially, one distinguishes between two cases. Either $f(Y)$ depends
only on the state of $Y$ at a fixed time or, alternatively, it depends on the
whole trajectory of $Y$. In the former case, extrapolation techniques can often
be applied to increase the order of convergence, see [21]. For Lévy-driven
stochastic differential equations, the Euler scheme was analyzed in [17] under
the assumption that the increments of the Lévy process can be simulated.
Approximate simulations of the Lévy increments are considered in [11].

In this article, we consider functionals $f$ that depend on the whole
trajectory. Concerning results for diffusions, we refer the reader to the
monograph [12]. For Lévy-driven stochastic differential equations, limit theorems
in distribution are provided in [10] and [18] for the discrepancy between the
genuine solution and Euler approximations.

Recently, Giles [7, 8] (see also [9]) introduced the so-called multilevel
Monte Carlo method to compute $S(f)$. It is very efficient when $Y$ is a
diffusion. Indeed, it can even be shown that it is, in some sense, optimal,
see [5]. For Lévy-driven stochastic differential equations, multilevel Monte
Carlo algorithms were first introduced and studied in [6]. Let us explain their
findings in terms of the Blumenthal-Getoor index (BG-index) of the driving
Lévy process, which is an index in $[0,2]$. It measures the frequency of
small jumps, see (3), where a large index corresponds to a process which
has small jumps at high frequencies. In particular, all Lévy processes which
have a finite number of jumps have BG-index zero. Whenever the BG-index
is smaller than or equal to one, the algorithms of [6] have worst case errors at
most of order $\tau^{-1/2}$ when the runtime $\tau$ tends to infinity. Unfortunately,
the efficiency decreases significantly for larger Blumenthal-Getoor indices.
Fig. 1. Order of convergence in dependence on the Blumenthal-Getoor index.

Typically, it is not feasible to simulate the increments of the Lévy process
perfectly, and one needs to work with approximations. This necessity typically
worsens the performance of an algorithm when the BG-index is larger
than one, due to the higher frequency of small jumps. It represents the main
bottleneck in the simulation. In this article, we consider approximative Lévy
increments that simulate the large jumps and approximate the small ones by
a normal distribution (Gaussian approximation) in the spirit of Asmussen
and Rosiński [2] (see also [4]). Whenever the BG-index is larger than one,
this approach is superior to the approach taken in [6], which neglects small
jumps in the simulation of Lévy increments.

To be more precise, we establish a new estimate for the Wasserstein metric
between an approximative solution with Gaussian approximation and the
genuine solution, see Theorem 3.1. It is based on a consequence of Zaitsev's
generalization [22] of the Komlós-Major-Tusnády coupling [13, 14] which
may be of independent interest, see Theorem 6.1. With these new estimates,
we analyze a class of multilevel Monte Carlo algorithms together with
a cost function which measures the computational complexity of the individual
algorithms. We provide upper error bounds for individual algorithms and
optimize the error over the parameters under a given cost constraint. When
the BG-index is larger than one, appropriately adjusted algorithms lead to
significantly smaller worst case errors over the class of Lipschitz functionals
than the ones analyzed so far, see Theorem 1.1, Corollary 1.2 and Figure 1.
In particular, one always obtains numerical schemes with errors at most of
order $\tau^{-1/6}$ when the runtime $\tau$ of the algorithm tends to infinity.
Notation and universal assumptions. We denote by $|\cdot|$ the Euclidean
norm for vectors as well as the Frobenius norm for matrices and let $\|\cdot\|$
denote the supremum norm over the interval $[0,1]$. $X=(X_t)_{t\ge0}$ denotes a
$d_X$-dimensional $L^2$-integrable Lévy process. By the Lévy-Khintchine formula,
it is characterized by a square integrable Lévy measure $\nu$ [a Borel
measure on $\mathbb{R}^{d_X}\setminus\{0\}$ with $\int|x|^2\,\nu(dx)<\infty$], a positive semi-definite matrix
$\Sigma\Sigma^*$ ($\Sigma$ being a $d_X\times d_X$-matrix), and a drift $b\in\mathbb{R}^{d_X}$ via

$$\mathbb{E}e^{i\langle\theta,X_t\rangle}=e^{t\psi(\theta)},\qquad\theta\in\mathbb{R}^{d_X},$$

where

$$\psi(\theta)=-\tfrac12|\Sigma^*\theta|^2+i\langle b,\theta\rangle+\int\bigl(e^{i\langle\theta,x\rangle}-1-i\langle\theta,x\rangle\bigr)\,\nu(dx).$$

Briefly, we call $X$ a $(\nu,\Sigma\Sigma^*,b)$-Lévy process, and when $b=0$, a $(\nu,\Sigma\Sigma^*)$-Lévy
martingale. All Lévy processes under consideration are assumed to be
càdlàg. As is well known, we can represent $X$ as a sum of three independent
processes

$$X_t=\Sigma W_t+L_t+bt,$$

where $W=(W_t)_{t\ge0}$ is a $d_X$-dimensional Wiener process and $L=(L_t)_{t\ge0}$ is
an $L^2$-martingale that comprises the compensated jumps of $X$. We consider
the integral equation

$$Y_t=y_0+\int_0^ta(Y_{s-})\,dX_s,\tag{1}$$

where $y_0\in\mathbb{R}^{d_Y}$ is a fixed deterministic initial value. We impose the standard
Lipschitz assumption on the function $a:\mathbb{R}^{d_Y}\to\mathbb{R}^{d_Y\times d_X}$: for a fixed $K<\infty$
and all $y,y'\in\mathbb{R}^{d_Y}$, one has

$$|a(y)-a(y')|\le K|y-y'|\quad\text{and}\quad|a(y_0)|\le K.$$

Furthermore, we assume without further mentioning that

$$\int|x|^2\,\nu(dx)\le K^2,\qquad|\Sigma|\le K\quad\text{and}\quad|b|\le K.$$

We refer to the monographs [3] and [20] for details concerning Levy pro- 
cesses. Moreover, a comprehensive introduction to the stochastic calculus 
for discontinuous semimartingales and, in particular, Levy processes can be 
found in [16] and [1]. 

In order to approximate the small jumps of the Levy process, we need to 
impose a uniform ellipticity assumption. 
Assumption UE. There are $\mathfrak{h}\in(0,1]$, $\vartheta\ge1$ and a linear subspace $\mathcal{H}$
of $\mathbb{R}^{d_X}$ such that for all $h\in(0,\mathfrak{h}]$ the Lévy measure $\nu|_{B(0,h)}$ is supported on
$\mathcal{H}$ and satisfies

$$\frac1\vartheta\int_{B(0,h)}\langle y,x\rangle^2\,\nu(dx)\le\int_{B(0,h)}\langle y',x\rangle^2\,\nu(dx)\le\vartheta\int_{B(0,h)}\langle y,x\rangle^2\,\nu(dx)$$

for all $y,y'\in\mathcal{H}$ with $|y|=|y'|$.
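As an illustration of Assumption UE, consider the planar case with $\mathcal{H}=\mathbb{R}^2$. The quadratic form $\int_{B(0,h)}\langle y,x\rangle^2\,\nu(dx)$ equals $y^*M_hy$ for the second-moment matrix $M_h=\int_{B(0,h)}xx^*\,\nu(dx)$, so for one fixed $h$ the smallest admissible $\vartheta$ is the ratio of the largest to the smallest eigenvalue of $M_h$. The sketch below computes this ratio; the function name and the matrix entries passed to it are hypothetical, and Assumption UE additionally requires the ratio to stay bounded uniformly in $h\in(0,\mathfrak{h}]$, which the sketch does not check.

```python
import math


def ellipticity_constant(m11, m12, m22):
    """Smallest theta with y'^T M y' <= theta * y^T M y whenever |y| = |y'|.

    M = [[m11, m12], [m12, m22]] is a symmetric 2x2 second-moment matrix
    (an illustrative stand-in for the matrix of small-jump second moments).
    """
    mean = 0.5 * (m11 + m22)
    radius = math.hypot(0.5 * (m11 - m22), m12)
    lam_max, lam_min = mean + radius, mean - radius  # eigenvalues of M
    if lam_min <= 0.0:
        raise ValueError("M is not positive definite on the whole plane")
    return lam_max / lam_min
```

For an isotropic matrix the constant is one, and it grows as the small jumps concentrate along one direction.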

Main results. We consider a class of multilevel Monte Carlo algorithms $\mathcal{A}$
together with a cost function $\mathrm{cost}:\mathcal{A}\to[0,\infty)$ that are introduced explicitly
in Section 2. For each algorithm $\widehat S\in\mathcal{A}$, we denote by $\widehat S(f)$ a real-valued
random variable representing the random output of the algorithm when
applied to a given measurable function $f:D[0,1]\to\mathbb{R}$. We work in the real
number model of computation, which means that we assume that arithmetic
operations with real numbers and comparisons can be done in one time
unit, see also [15]. Our cost function represents the runtime of the algorithm
reasonably well when supposing that

• one can sample from the distribution $\nu|_{B(0,h)^c}/\nu(B(0,h)^c)$ and the uniform
distribution on $[0,1]$ in constant time,

• one can evaluate $a$ at any point $y\in\mathbb{R}^{d_Y}$ in constant time, and

• $f$ can be evaluated for piecewise constant functions in less than a constant
multiple of its breakpoints plus one time units.

As pointed out below, in that case, the average runtime to evaluate $\widehat S(f)$ is
less than a constant multiple of $\mathrm{cost}(\widehat S)$. We analyze the minimal worst case
error

$$\mathrm{err}(\tau)=\inf_{\substack{\widehat S\in\mathcal{A}:\\\mathrm{cost}(\widehat S)\le\tau}}\ \sup_{f\in\mathrm{Lip}(1)}\mathbb{E}\bigl[|S(f)-\widehat S(f)|^2\bigr]^{1/2},\qquad\tau\ge1.$$

Here and elsewhere, $\mathrm{Lip}(1)$ denotes the class of measurable functions $f:D[0,1]\to\mathbb{R}$
that are Lipschitz continuous with respect to the supremum norm with
coefficient one.

In this article, we use asymptotic comparisons. We write $f\approx g$ for
$0<\liminf\frac fg\le\limsup\frac fg<\infty$, and $f\precsim g$ or, equivalently, $g\succsim f$, for
$\limsup\frac fg<\infty$. Our main findings are summarized in the following theorem.



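Theorem 1.1. Assume that Assumption UE is valid and let $g:(0,\infty)\to(0,\infty)$
be a decreasing and invertible function such that for all $h>0$

$$\int\frac{|x|^2}{h^2}\wedge1\,\nu(dx)\le g(h)$$

and, for a fixed $\gamma>1$,

$$g\Bigl(\frac\gamma2h\Bigr)\ge2g(h)\tag{2}$$

for all sufficiently small $h>0$.

(I) If $\Sigma=0$ or

$$g^{-1}(x)\succsim x^{-3/4}\quad\text{as }x\to\infty,$$

then

$$\mathrm{err}(\tau)\precsim g^{-1}\bigl((\tau\log\tau)^{2/3}\bigr)\,\tau^{1/6}(\log\tau)^{2/3}\quad\text{as }\tau\to\infty.$$

(II) If

$$g^{-1}(x)\precsim x^{-3/4}\quad\text{as }x\to\infty,$$

then

$$\mathrm{err}(\tau)\precsim\sqrt{\frac{\log g^*(\tau)}{g^*(\tau)}}\quad\text{as }\tau\to\infty,$$

where $g^*(\tau)=\inf\{x>1:x^3g^{-1}(x)^2(\log x)^{-1}\ge\tau\}$.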



The class of algorithms A together with appropriate parameters which 
establish the error estimates above are stated explicitly in Section 2. 
In terms of the Blumenthal-Getoor index

$$\beta:=\inf\Bigl\{p>0:\int_{B(0,1)}|x|^p\,\nu(dx)<\infty\Bigr\}\in[0,2]\tag{3}$$

we get the following corollary.



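Corollary 1.2. Assume that Assumption UE is valid and that the BG-index
satisfies $\beta\ge1$. If $\Sigma=0$ or $\beta\ge\frac43$, then

$$\sup\{\gamma\ge0:\mathrm{err}(\tau)\precsim\tau^{-\gamma}\}\ge\frac{4-\beta}{6\beta},$$

and, if $\Sigma\ne0$ and $\beta<\frac43$,

$$\sup\{\gamma\ge0:\mathrm{err}(\tau)\precsim\tau^{-\gamma}\}\ge\frac{\beta}{6\beta-4}.$$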
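The two regimes of Corollary 1.2 can be tabulated with a few lines of Python. This is only a convenience sketch: the function name `guaranteed_rate` and the boolean flag `gaussian_part` (standing for $\Sigma\ne0$) are illustrative names that do not appear in the article.

```python
def guaranteed_rate(beta, gaussian_part):
    """Lower bound on the convergence order from Corollary 1.2.

    beta          -- Blumenthal-Getoor index, assumed to lie in [1, 2]
    gaussian_part -- True if the driving process has a Wiener component
    """
    if not 1.0 <= beta <= 2.0:
        raise ValueError("the corollary covers BG-indices in [1, 2] only")
    if (not gaussian_part) or beta >= 4.0 / 3.0:
        return (4.0 - beta) / (6.0 * beta)
    return beta / (6.0 * beta - 4.0)
```

Both expressions agree at $\beta=\frac43$, where they equal $\frac13$, and the first one decreases to the worst exponent $\frac16$ at $\beta=2$.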
Visualization of the results and relationship to other work. Figure 1 illustrates
our findings and related results. The x-axis and y-axis represent the
Blumenthal-Getoor index and the order of convergence, respectively. Note
that MLMC 0 stands for the multilevel Monte Carlo algorithm which does
not apply a Gaussian approximation, see [6]. Both lines marked as MLMC
1 illustrate Corollary 1.2, where the additional (G) refers to the case where
the SDE comprises a Wiener process.

These results are to be compared with the results of Jacod et al. [11].
There, an approximate Euler method is analyzed by means of weak approximation.
In contrast to our investigation, the object of that article is to
compute $\mathbb{E}f(X_T)$ for a fixed time $T>0$. Under quite strong assumptions
(for instance, $a$ and $f$ have to be four times continuously differentiable and
the eighth moment of the Lévy process needs to be finite), they provide error
bounds for a numerical scheme which is based on Monte Carlo simulation
of one approximative solution. In the figure, the two lines quoted as JKMP
represent the order of convergence for general and for pseudo-symmetrical
Lévy processes, respectively. In addition to the illustrated schemes, [11] provide
an expansion which admits a Romberg extrapolation under additional
assumptions.

We stress the fact that our analysis is applicable to general path dependent 
functionals and that our error criterion is the worst case error over the 
class of Lipschitz continuous functionals with respect to supremum norm. 
In particular, our class contains most of the continuous payoffs appearing in 
finance. 

We remark that our results provide upper bounds for the inferred error
and so far no lower bounds are known. The worst exponent appearing in our
estimates is $\frac16$, which we obtain for Lévy processes with Blumenthal-Getoor
index 2. Interestingly, this is also the worst exponent appearing in [19] in
the context of strong approximation of SDEs driven by subordinated Lévy
processes.

Agenda. The article is organized as follows. In Section 2, we introduce 
a class of multilevel Monte Carlo algorithms together with a cost function. 
Here, we also provide the crucial estimate for the mean squared error which 
motivates the consideration of the Wasserstein distance between an approxi- 
mative and the genuine solution, see (6). Section 3 states the central estimate 
for the former Wasserstein distance, see Theorem 3.1. In this section, we ex- 
plain the strategy of the proof and the structure of the remaining article 
in detail. For the proof, we couple the driving Levy process with a Levy 
process constituted by the large jumps plus a Gaussian compensation of the 
small jumps and we write the difference between the approximative and the 
genuine solution as a telescoping sum including further auxiliary processes, 
see (9) and (10). The individual errors are then controlled in Sections 4 and
5 for the terms which do not depend on the particular choice of the coupling
and in Section 7 for the error terms that do depend on the particular choice.
In between, in Section 6, we establish the crucial KMT like coupling result 
for the Levy process. Finally, in Section 8, we combine the approximation 
result for the Wasserstein metric (Theorem 3.1) with estimates for strong 
approximation of stochastic differential equations from [6] to prove the main 
results stated above. 

2. Multilevel Monte Carlo. Based on a number of parameters, we define
a multilevel Monte Carlo algorithm $\widehat S$: We denote by $m$ and $n_1,\dots,n_m$ natural
numbers and let $\varepsilon_1,\dots,\varepsilon_m$ and $h_1,\dots,h_m$ denote decreasing sequences
of positive reals. Formally, the algorithm $\widehat S$ can be represented as a tuple
constituted by these parameters, and we denote by $\mathcal{A}$ the set of all possible
choices for $\widehat S$. We continue with defining processes that depend on the
latter parameters. For ease of notation, the parameters are omitted in the
definitions below.

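We choose a square matrix $\Sigma^{(k)}$ such that $(\Sigma^{(k)}(\Sigma^{(k)})^*)_{i,j}=\int_{B(0,h_k)}x_ix_j\,\nu(dx)$.
Moreover, for $k=1,\dots,m$, we let $L^{(k)}=(L^{(k)}_t)_{t\ge0}$ denote the $(\nu|_{B(0,h_k)^c},0)$-Lévy
martingale which comprises the compensated jumps of $L$ that are
larger than $h_k$, that is

$$L^{(k)}_t=\sum_{s\le t}1_{\{|\Delta L_s|>h_k\}}\Delta L_s-t\int_{B(0,h_k)^c}x\,\nu(dx).$$

Here and elsewhere, we denote $\Delta L_t=L_t-L_{t-}$. We let $B=(B_t)_{t\ge0}$ be an
independent Wiener process (independent of $W$ and $L^{(k)}$), and consider, for
$k=1,\dots,m$, the processes $\mathcal{X}^{(k)}=(\Sigma W_t+\Sigma^{(k)}B_t+L^{(k)}_t+bt)_{t\ge0}$ as driving
processes. Let $\Upsilon^{(k)}$ denote the solution to

$$\Upsilon^{(k)}_t=y_0+\int_0^ta(\Upsilon^{(k)}_{s-})\,d\mathcal{X}^{(k)}_{\iota^{(k)}(s)},$$

where $(\iota^{(k)}(t))_{t\ge0}$ is given via $\iota^{(k)}(t)=\max(\mathbb{I}^{(k)}\cap[0,t])$ and the set $\mathbb{I}^{(k)}$ is
constituted by the random times $(T^{(k)}_j)_{j\in\mathbb{Z}_+}$ that are inductively defined via
$T^{(k)}_0=0$ and

$$T^{(k)}_{j+1}=\inf\{t\in(T^{(k)}_j,\infty):|\Delta L_t|>h_k\text{ or }t=T^{(k)}_j+\varepsilon_k\}.$$

Clearly, $\Upsilon^{(k)}$ is constant on each interval $[T^{(k)}_j,T^{(k)}_{j+1})$ and one has

$$\Upsilon^{(k)}_{T^{(k)}_{j+1}}=\Upsilon^{(k)}_{T^{(k)}_j}+a\bigl(\Upsilon^{(k)}_{T^{(k)}_j}\bigr)\bigl(\mathcal{X}^{(k)}_{T^{(k)}_{j+1}}-\mathcal{X}^{(k)}_{T^{(k)}_j}\bigr).\tag{4}$$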
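To make the recursion (4) concrete, the following Python sketch simulates one path of a single level in dimension one. Everything specific in it is an assumption made for illustration and is not taken from the article: the Lévy measure is the symmetric truncated stable-like measure $\nu(dx)=|x|^{-1-\alpha}\,dx$ on $0<|x|\le1$ (so the compensator of the large jumps vanishes, $\nu(B(0,h)^c)=2(h^{-\alpha}-1)/\alpha$ and the small-jump variance is $2h^{2-\alpha}/(2-\alpha)$), and the name `simulate_level` is hypothetical. The two Gaussian parts $\Sigma W$ and $\Sigma^{(k)}B$ are merged into one normal increment, which is correct in distribution for a single level but would not suffice for coupling two consecutive levels.

```python
import math
import random


def simulate_level(a, y0, sigma, b, alpha, h, eps, rng):
    """Grid times in [0, 1] and scheme values following recursion (4).

    Assumed Levy measure: |x|^(-1-alpha) dx on 0 < |x| <= 1, with 0 < h <= 1.
    """
    tail_mass = 2.0 * (h ** -alpha - 1.0) / alpha          # nu(B(0,h)^c)
    small_var = 2.0 * h ** (2.0 - alpha) / (2.0 - alpha)   # small-jump variance
    gauss_sd = math.sqrt(sigma ** 2 + small_var)           # merged Gaussian part
    t, y = 0.0, y0
    times, values = [t], [y]
    next_jump = rng.expovariate(tail_mass) if tail_mass > 0.0 else math.inf
    while True:
        # Next grid point: a large jump or a refinement after eps time units.
        t_next = min(next_jump, t + eps)
        if t_next > 1.0:
            break
        dt = t_next - t
        dx = gauss_sd * rng.gauss(0.0, math.sqrt(dt)) + b * dt
        if t_next == next_jump:
            # Inverse-transform sample of the jump size on [h, 1), random sign.
            u = rng.random()
            size = (h ** -alpha - u * (h ** -alpha - 1.0)) ** (-1.0 / alpha)
            dx += size if rng.random() < 0.5 else -size
            next_jump = t_next + rng.expovariate(tail_mass)
        y = y + a(y) * dx    # Euler update with the coefficient frozen at T_j
        t = t_next
        times.append(t)
        values.append(y)
    return times, values
```

A call such as `simulate_level(lambda y: 1.0 + 0.1 * y, 0.0, 0.2, 0.0, 1.5, 0.05, 2.0 ** -5, random.Random(1))` returns the breakpoints and values of one piecewise constant path, on which a path functional can then be evaluated.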



Note that we can write

$$\mathbb{E}[f(\Upsilon^{(m)})]=\sum_{k=2}^m\mathbb{E}[f(\Upsilon^{(k)})-f(\Upsilon^{(k-1)})]+\mathbb{E}[f(\Upsilon^{(1)})].$$

The multilevel Monte Carlo algorithm, identified with $\widehat S$, estimates each
expectation $\mathbb{E}[f(\Upsilon^{(k)})-f(\Upsilon^{(k-1)})]$ (resp., $\mathbb{E}[f(\Upsilon^{(1)})]$) individually by sampling
independently $n_k$ (resp., $n_1$) versions of $f(\Upsilon^{(k)})-f(\Upsilon^{(k-1)})$ [resp., $f(\Upsilon^{(1)})$]
and taking the average. The output of the algorithm is then the sum of the
individual estimates. We denote by $\widehat S(f)$ a random variable that models the
random output of the algorithm when applied to $f$.
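The telescoping construction above can be sketched generically in Python. The sampler `sample_pair` is a hypothetical callback, not something defined in the article: it is assumed to return the pair $(f(\Upsilon^{(k)}),f(\Upsilon^{(k-1)}))$ computed from the same underlying randomness, and for $k=1$ its second entry is ignored.

```python
import random


def mlmc_estimate(sample_pair, n, rng):
    """Sum over levels of the averaged corrections f(fine) - f(coarse).

    sample_pair(k, rng) -- returns (fine, coarse) for level k (assumed coupled)
    n                   -- list [n_1, ..., n_m] of replication numbers
    """
    total = 0.0
    for k, n_k in enumerate(n, start=1):
        acc = 0.0
        for _ in range(n_k):
            fine, coarse = sample_pair(k, rng)
            # Level 1 estimates E[f(level 1)] itself, not a difference.
            acc += fine - (coarse if k > 1 else 0.0)
        total += acc / n_k
    return total
```

With a noise-free toy sampler the sum telescopes exactly to the finest-level value, which is a quick sanity check for an implementation.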

The mean squared error of an algorithm. The Monte Carlo algorithm
introduced above induces the mean squared error

$$\mathrm{mse}(\widehat S,f)=|\mathbb{E}[f(Y)]-\mathbb{E}[f(\Upsilon^{(m)})]|^2+\sum_{k=2}^m\frac1{n_k}\mathrm{var}\bigl(f(\Upsilon^{(k)})-f(\Upsilon^{(k-1)})\bigr)+\frac1{n_1}\mathrm{var}\bigl(f(\Upsilon^{(1)})\bigr),$$

when applied to $f$. For two $D[0,1]$-valued random elements $Z^{(1)}$ and $Z^{(2)}$, we
denote by $\mathcal{W}(Z^{(1)},Z^{(2)})$ the Wasserstein metric of second order with respect
to the supremum norm, that is

$$\mathcal{W}(Z^{(1)},Z^{(2)})=\inf_\xi\Bigl(\int\|z^{(1)}-z^{(2)}\|^2\,d\xi(z^{(1)},z^{(2)})\Bigr)^{1/2},\tag{5}$$

where the infimum is taken over all probability measures $\xi$ on $D[0,1]\times D[0,1]$
having first marginal $\mathbb{P}_{Z^{(1)}}$ and second marginal $\mathbb{P}_{Z^{(2)}}$. Clearly, the
Wasserstein distance depends only on the distributions of $Z^{(1)}$ and $Z^{(2)}$.
Now, we get for $f\in\mathrm{Lip}(1)$ that

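$$\mathrm{mse}(\widehat S,f)\le\mathcal{W}(Y,\Upsilon^{(m)})^2+\sum_{k=2}^m\frac1{n_k}\mathbb{E}\bigl[\|\Upsilon^{(k)}-\Upsilon^{(k-1)}\|^2\bigr]+\frac1{n_1}\mathbb{E}\bigl[\|\Upsilon^{(1)}-y_0\|^2\bigr].\tag{6}$$

We set

$$\mathrm{mse}(\widehat S)=\sup_{f\in\mathrm{Lip}(1)}\mathrm{mse}(\widehat S,f),$$

and remark that estimate (6) remains valid for the worst case error $\mathrm{mse}(\widehat S)$.

The main task of this article is to provide good estimates for the Wasserstein
metric $\mathcal{W}(Y,\Upsilon^{(m)})$. The remaining terms on the right-hand side of (6)
are controlled with estimates from [6].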
The cost function. In order to simulate one pair $(\Upsilon^{(k-1)},\Upsilon^{(k)})$, we need
to simulate all displacements of $L$ of size larger than or equal to $h_k$ on the time
interval $[0,1]$. Moreover, we need the increments of the Wiener process on the
time skeleton $(\mathbb{I}^{(k-1)}\cup\mathbb{I}^{(k)})\cap[0,1]$. Then we can construct our approximation
via (4). In the real number model of computation (under the assumptions
described in the Introduction), this can be performed with runtime less
than a multiple of the number of entries in $\mathbb{I}^{(k)}\cap[0,1]$, see [6] for a detailed
description of an implementation of a similar scheme. Since

$$\mathbb{E}\bigl[\#(\mathbb{I}^{(k)}\cap[0,1])\bigr]\le1+\frac1{\varepsilon_k}+\mathbb{E}\Bigl[\sum_{t\in[0,1]}1_{\{|\Delta L_t|\ge h_k\}}\Bigr]=1+\frac1{\varepsilon_k}+\nu(B(0,h_k)^c),$$

we define, for $\widehat S\in\mathcal{A}$,

$$\mathrm{cost}(\widehat S)=\sum_{k=1}^mn_k\Bigl[\nu(B(0,h_k)^c)+\frac1{\varepsilon_k}+1\Bigr].$$

Then supposing that $\varepsilon_1\le1$ and $\nu(B(0,h_k)^c)\le\frac1{\varepsilon_k}$ for $k=1,\dots,m$ yields
that

$$\mathrm{cost}(\widehat S)\le3\sum_{k=1}^m\frac{n_k}{\varepsilon_k}.\tag{7}$$
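The cost function and the simplified bound (7) are straightforward to evaluate numerically. The sketch below does so for given parameter lists; the function names and the numbers in the usage example are illustrative assumptions, and the tail masses $\nu(B(0,h_k)^c)$ are passed in as precomputed values rather than derived from a particular Lévy measure.

```python
def cost(n, eps, tail_mass):
    """Sum of n_k * (nu(B(0,h_k)^c) + 1/eps_k + 1) over the levels."""
    return sum(n_k * (nu_k + 1.0 / e_k + 1.0)
               for n_k, e_k, nu_k in zip(n, eps, tail_mass))


def cost_bound(n, eps):
    """Right-hand side of (7): three times the sum of n_k / eps_k."""
    return 3.0 * sum(n_k / e_k for n_k, e_k in zip(n, eps))
```

For instance, with `n = [8, 4, 2]`, `eps = [0.5, 0.25, 0.125]` and tail masses `[1.0, 3.0, 8.0]`, the conditions $\varepsilon_1\le1$ and $\nu(B(0,h_k)^c)\le1/\varepsilon_k$ hold, and the cost stays below the bound as (7) states.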



Algorithms achieving the error rates of Theorem 1.1. Let us now quote
the choice of parameters which establish the error rates of Theorem 1.1. In
general, one chooses $\varepsilon_k=2^{-k}$ and $h_k=g^{-1}(2^k)$ for $k\in\mathbb{Z}_+$. Moreover, in
case (I), for sufficiently large $\tau$, one picks

$$m=\bigl\lfloor\log_2C_1(\tau\log\tau)^{2/3}\bigr\rfloor\quad\text{and}\quad n_k=\Bigl\lfloor C_2\tau^{1/3}(\log\tau)^{-2/3}\frac{g^{-1}(2^k)}{g^{-1}(2^m)}\Bigr\rfloor\quad\text{for }k=1,\dots,m,$$

where $C_1$ and $C_2$ are appropriate constants that do not depend on $\tau$. In case
(II), one chooses

$$m=\lfloor\log_2C_1g^*(\tau)\rfloor\quad\text{and}\quad n_k=\Bigl\lfloor C_2\frac{g^*(\tau)^2}{\log g^*(\tau)}g^{-1}(2^k)\,g^{-1}(2^m)\Bigr\rfloor\quad\text{for }k=1,\dots,m,$$

where again $C_1$ and $C_2$ are appropriate constants. We refer the reader to
the proof of Theorem 1.1 for the error estimates of this choice.
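The parameter choice in case (I) can be sketched in Python as follows. Several ingredients are assumptions for illustration only: the bounding function is taken to be $g(h)=h^{-\beta}$, so that $g^{-1}(x)=x^{-1/\beta}$, the constants $C_1$ and $C_2$ are set to one as placeholders (the article only states that appropriate constants exist), and the function name is hypothetical.

```python
import math


def parameters_case_one(tau, beta, c1=1.0, c2=1.0):
    """Levels m, replications n_k, step sizes eps_k and thresholds h_k.

    Assumes g(h) = h**(-beta), hence g^{-1}(x) = x**(-1/beta).
    """
    def g_inv(x):
        return x ** (-1.0 / beta)

    m = math.floor(math.log2(c1 * (tau * math.log(tau)) ** (2.0 / 3.0)))
    eps = [2.0 ** -k for k in range(1, m + 1)]
    h = [g_inv(2.0 ** k) for k in range(1, m + 1)]
    scale = c2 * tau ** (1.0 / 3.0) * math.log(tau) ** (-2.0 / 3.0)
    n = [math.floor(scale * h_k / h[-1]) for h_k in h]
    return m, n, eps, h
```

With this choice the simplified cost $3\sum_kn_k/\varepsilon_k$ grows proportionally to $\tau$: the replication numbers decrease geometrically in $k$ while $1/\varepsilon_k$ doubles, so the finest level dominates.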






3. Weak approximation. In this section, we provide the central estimate
for the Wasserstein metric appearing in (6). For ease of notation, we denote
by $\varepsilon$ and $h$ two positive parameters which correspond to $\varepsilon_m$ and $h_m$ above.
We denote by $\Sigma'$ a square matrix with $\Sigma'(\Sigma')^*=(\int_{B(0,h)}x_ix_j\,\nu(dx))_{i,j\in\{1,\dots,d_X\}}$.
Moreover, we let $L'$ denote the process constituted by the compensated
jumps of $L$ of size larger than $h$, and let $B=(B_t)_{t\ge0}$ be a $d_X$-dimensional
Wiener process that is independent of $W$ and $L'$. Then we consider the
solution $\Upsilon=(\Upsilon_t)_{t\ge0}$ of the integral equation

$$\Upsilon_t=y_0+\int_0^ta(\Upsilon_{\iota(s-)})\,d\mathcal{X}_s,$$

where $\mathcal{X}=(\mathcal{X}_t)_{t\ge0}$ is given as $\mathcal{X}_t=\Sigma W_t+\Sigma'B_t+L'_t+bt$ and $\iota(t)=\max(\mathbb{I}\cap[0,t])$,
where $\mathbb{I}$ is, in analogy to above, the set of random times $(T_j)_{j\in\mathbb{Z}_+}$
defined inductively via $T_0=0$ and

$$T_{j+1}=\inf\{t\in(T_j,\infty):|\Delta L_t|>h\text{ or }t=T_j+\varepsilon\}\quad\text{for }j\in\mathbb{Z}_+.$$

The process $\Upsilon$ is closely related to $\Upsilon^{(m)}$ from Section 2, and choosing $\varepsilon=\varepsilon_m$
and $h=h_m$ implies that $(\Upsilon_{\iota(t)})_{t\ge0}$ and $\Upsilon^{(m)}$ are identically distributed.

We need to introduce two further crucial quantities: for $h>0$, let
$F(h)=\int_{B(0,h)}|x|^2\,\nu(dx)$ and $F_0(h)=\int_{B(0,h)^c}x\,\nu(dx)$.

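Theorem 3.1. Suppose that Assumption UE is valid. There exists a
finite constant $\kappa$ that depends only on $K$, $d_X$ and $\vartheta$ such that for $\varepsilon\in(0,\frac12]$,
$\varepsilon'\in[2\varepsilon,1]$, and $h\in(0,\mathfrak{h}]$ with $\nu(B(0,h)^c)\le\frac1\varepsilon$ one has

$$\mathcal{W}(Y,\Upsilon_{\iota(\cdot)})^2\le\kappa\Bigl[F(h)\varepsilon'+\frac{h^2}{\varepsilon'}\log\Bigl(\frac{\varepsilon'F(h)}{h^2}\vee e\Bigr)^2+\varepsilon\log\frac e\varepsilon\Bigr]$$

and, if $\Sigma=0$, one has

$$\mathcal{W}(Y,\Upsilon_{\iota(\cdot)})^2\le\kappa\Bigl[F(h)\Bigl(\varepsilon'+\varepsilon\log\frac e\varepsilon\Bigr)+\frac{h^2}{\varepsilon'}\log\Bigl(\frac{\varepsilon'F(h)}{h^2}\vee e\Bigr)^2+|b-F_0(h)|^2\varepsilon^2\Bigr].$$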



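Corollary 3.2. Under Assumption UE, there exists a constant $\kappa=\kappa(K,d_X,\vartheta)$
such that for all $\varepsilon\in(0,\frac14]$ and $h\in(0,\mathfrak{h}]$ with $\nu(B(0,h)^c)\vee\frac{F(h)}{h^2}\le\frac1\varepsilon$,
one has

$$\mathcal{W}(Y,\Upsilon_{\iota(\cdot)})^2\le\kappa\Bigl(h^2\frac1{\sqrt\varepsilon}+\varepsilon\Bigr)\log\frac e\varepsilon,$$

and, in the case where $\Sigma=0$,

$$\mathcal{W}(Y,\Upsilon_{\iota(\cdot)})^2\le\kappa\Bigl(h^2\frac1{\sqrt\varepsilon}\log\frac e\varepsilon+|b-F_0(h)|^2\varepsilon^2\Bigr).$$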
Proof. Choose $\varepsilon'=\sqrt\varepsilon\log\frac1\varepsilon$ and observe that $\varepsilon'\ge2\varepsilon$ since $\varepsilon\le\frac14$.
Using that $\frac{\varepsilon'F(h)}{h^2}\le\frac{F(h)}{h^2}\le\frac1\varepsilon$, it is straightforward to verify the estimate with
Theorem 3.1. □
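The admissibility of the choice made in this proof is easy to check numerically. The sketch below assumes the choice reads $\varepsilon'=\sqrt\varepsilon\,\log(1/\varepsilon)$ and that $\varepsilon$ ranges over $(0,\frac14]$, which is how the garbled source is interpreted here; under that reading it confirms that $\varepsilon'$ lies in the interval $[2\varepsilon,1]$ required by Theorem 3.1. The function name is hypothetical.

```python
import math


def eps_prime(eps):
    """Assumed balancing choice eps' = sqrt(eps) * log(1/eps)."""
    return math.sqrt(eps) * math.log(1.0 / eps)


def is_admissible(eps):
    """True if eps' lies in [2*eps, 1] as Theorem 3.1 requires."""
    value = eps_prime(eps)
    return 2.0 * eps <= value <= 1.0
```

The function $\varepsilon\mapsto\sqrt\varepsilon\log(1/\varepsilon)$ attains its maximum $2/e\approx0.74$ at $\varepsilon=e^{-2}$, so the upper constraint $\varepsilon'\le1$ is never active on this range.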

3.1. Strategy of the proof of Theorem 3.1 and main notation. We represent
$X$ as

$$X_t=\Sigma W_t+L'_t+L''_t+bt,$$

where $L''=(L''_t)_{t\ge0}=L-L'$ is the process which comprises the compensated
jumps of $L$ of size smaller than $h$. Based on an additional parameter $\varepsilon'\in[2\varepsilon,1]$,
we couple $L''$ with $\Sigma'B$. The introduction of the explicit coupling is
deferred to Section 7. Let us roughly explain the idea behind the parameter
$\varepsilon'$. In classical Euler schemes, the coefficients of the SDE are updated in
either a deterministic or a random number of steps of a given (typical)
length. Our approximation updates the coefficients at steps of order $\varepsilon$, as the
classical Euler method does. However, in our case the Lévy process that comprises
the small jumps is ignored for most of the time steps. It is only considered
on steps of order of size $\varepsilon'$.

On the one hand, a large $\varepsilon'$ reduces the accuracy of the approximation.
On the other hand, the part of the small jumps has to be approximated by a
Wiener process and the error inferred from the coupling decreases in $\varepsilon'$. This
explains the increasing and decreasing terms in Theorem 3.1. Balancing $\varepsilon'$
and $\varepsilon$ then leads to Corollary 3.2.

We need some auxiliary processes. Analogously to $\mathbb{I}$ and $\iota$, we let $\mathbb{J}$ denote
the set of random times $(T_j)_{j\in\mathbb{Z}_+}$ defined inductively by $T_0=0$ and

$$T_{j+1}=\min(\mathbb{I}\cap(T_j+\varepsilon'-\varepsilon,\infty))$$

so that the mesh-size of $\mathbb{J}$ is less than or equal to $\varepsilon'$. Moreover, we set
$\eta(t)=\max(\mathbb{J}\cap[0,t])$.

Let us now introduce the first auxiliary processes. We set $X'=(X_t-L''_t)_{t\ge0}$
and we consider the solution $\bar Y'=(\bar Y'_t)_{t\ge0}$ to the integral equation

$$\bar Y'_t=y_0+\int_0^ta(\bar Y'_{\iota(s-)})\,dX'_s+\int_0^ta(\bar Y'_{\eta(s-)})\,dL''_{\eta(s)}\tag{8}$$

and the process $\bar Y=(\bar Y_t)_{t\ge0}$ given by

$$\bar Y_t=\bar Y'_t+a(\bar Y'_{\eta(t)})(L''_t-L''_{\eta(t)}).$$

It coincides with $\bar Y'$ for all times in $\mathbb{J}$ and satisfies

$$\bar Y_t=y_0+\int_0^ta(\bar Y'_{\iota(s-)})\,dX'_s+\int_0^ta(\bar Y_{\eta(s-)})\,dL''_s.$$
Next, we replace the term $L''$ by the Gaussian term $\Sigma'B$ in the above
integral equations and obtain analogs of $\bar Y'$ and $\bar Y$ which are denoted by $\bar\Upsilon'$
and $\bar\Upsilon$. To be more precise, $\bar\Upsilon'=(\bar\Upsilon'_t)_{t\ge0}$ is the solution to the stochastic
integral equation

$$\bar\Upsilon'_t=y_0+\int_0^ta(\bar\Upsilon'_{\iota(s-)})\,dX'_s+\int_0^ta(\bar\Upsilon'_{\eta(s-)})\Sigma'\,dB_{\eta(s)},$$

and $\bar\Upsilon=(\bar\Upsilon_t)_{t\ge0}$ is given via

$$\bar\Upsilon_t=\bar\Upsilon'_t+a(\bar\Upsilon'_{\eta(t)})\Sigma'(B_t-B_{\eta(t)}).$$

We now focus on the discrepancy of $Y$ and $\Upsilon_{\iota(\cdot)}$. By the triangle inequality,
one has

$$\|Y-\Upsilon_{\iota(\cdot)}\|\le\|Y-\bar Y\|+\|\bar Y-\bar\Upsilon\|+\|\bar\Upsilon-\Upsilon\|+\|\Upsilon-\Upsilon_{\iota(\cdot)}\|.\tag{9}$$

Moreover, the second term on the right satisfies

$$\|\bar Y-\bar\Upsilon\|\le\|\bar Y'-\bar\Upsilon'\|+\|\bar Y-\bar Y'-(\bar\Upsilon-\bar\Upsilon')\|.\tag{10}$$

In order to prove Theorem 3.1, we control the error terms individually. The
first term on the right-hand side of (9) is considered in Proposition 4.1. The
third and fourth terms are treated in Propositions 5.1 and 5.2, respectively.
The terms on the right-hand side of (10) are investigated in Propositions 7.1
and 7.2, respectively. Note that only the latter two expressions depend on the
particular choice of the coupling of $L''$ and $\Sigma'B$. Once the above-mentioned
propositions are proved, the statement of Theorem 3.1 follows immediately
by combining these estimates and identifying the dominant terms.

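4. Approximation of $Y$ by $\bar Y$.

Proposition 4.1. There exists a constant $\kappa>0$ depending on $K$ only
such that, for $\varepsilon\in(0,\frac12]$, $\varepsilon'\in[2\varepsilon,1]$ and $h>0$ with $\nu(B(0,h)^c)\le\frac1\varepsilon$, one has

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}|Y_t-\bar Y_t|^2\Bigr]\le\kappa\bigl[F(h)\varepsilon'+|b-F_0(h)|^2\varepsilon^2\bigr]$$

if $\Sigma=0$, and

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}|Y_t-\bar Y_t|^2\Bigr]\le\kappa(\varepsilon+F(h)\varepsilon')\tag{11}$$

for general $\Sigma$.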

Proof. For $t\ge0$, we consider $Z_t=Y_t-\bar Y_t$, $Z'_t=Y_t-\bar Y'_{\iota(t)}$, $Z''_t=Y_t-\bar Y_{\eta(t)}$
and $z(t)=\mathbb{E}[\sup_{s\in[0,t]}|Z_s|^2]$. The main task of the proof is to establish
an estimate of the form

$$z(t)\le\alpha_1\int_0^tz(s)\,ds+\alpha_2$$

for appropriate values $\alpha_1,\alpha_2>0$. Since $z$ is finite (see, for instance, [6]),
Gronwall's inequality then implies the upper bound

$$\mathbb{E}\Bigl[\sup_{s\in[0,1]}|Y_s-\bar Y_s|^2\Bigr]\le\alpha_2\exp(\alpha_1).$$
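The Gronwall step can be illustrated numerically. The sketch below solves the extremal case $z(t)=\alpha_1\int_0^tz(s)\,ds+\alpha_2$ on $[0,1]$ with an explicit Euler discretization and compares $z(1)$ with $\alpha_2\exp(\alpha_1)$. The function name and the numerical values are illustrative assumptions and play no role in the proof.

```python
import math


def gronwall_extremal(alpha1, alpha2, steps=10000):
    """Explicit Euler value at t = 1 of z(t) = alpha1 * int_0^t z ds + alpha2."""
    dt = 1.0 / steps
    z, integral = alpha2, 0.0
    for _ in range(steps):
        integral += z * dt               # left-endpoint rule for the integral
        z = alpha1 * integral + alpha2   # equality case of the hypothesis
    return z
```

The discrete solution equals $\alpha_2(1+\alpha_1/N)^N$ for $N$ steps, which increases to $\alpha_2e^{\alpha_1}$ as $N\to\infty$, so the Gronwall bound is attained in the limit and never exceeded.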



We proceed in two steps. 
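1st step. Note that

$$Z_t=\underbrace{\int_0^t\bigl(a(Y_{s-})-a(\bar Y'_{\iota(s-)})\bigr)\,d(\Sigma W_s+L'_s)+\int_0^t\bigl(a(Y_{s-})-a(\bar Y_{\eta(s-)})\bigr)\,dL''_s}_{=:M_t}+\int_0^t\bigl(a(Y_{s-})-a(\bar Y'_{\iota(s-)})\bigr)b\,ds,$$

so that

$$|Z_t|^2\le2|M_t|^2+2\Bigl|\int_0^t\bigl(a(Y_{s-})-a(\bar Y'_{\iota(s-)})\bigr)b\,ds\Bigr|^2.\tag{12}$$

For $t\in[0,1]$, we conclude with the Cauchy-Schwarz inequality that the
second term on the right-hand side is bounded by $2K^4\int_0^t|Z'_{s-}|^2\,ds$.

Certainly, $(M_t)$ is a (local) martingale with respect to the canonical filtration,
and we apply the Doob inequality together with Lemma A.1 to deduce
that

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|M_s|^2\Bigr]\le4\,\mathbb{E}\Bigl[\int_0^t|a(Y_{s-})-a(\bar Y'_{\iota(s-)})|^2\,d\langle\Sigma W+L'\rangle_s+\int_0^t|a(Y_{s-})-a(\bar Y_{\eta(s-)})|^2\,d\langle L''\rangle_s\Bigr].$$

Here and elsewhere, for a multivariate local $L^2$-martingale $S=(S_t)_{t\ge0}$, we
denote $\langle S\rangle=\sum_j\langle S^{(j)}\rangle$, and $\langle S^{(j)}\rangle$ denotes the predictable compensator of
the classical bracket process of the $j$th coordinate $S^{(j)}$ of $S$. Note that
$d\langle\Sigma W+L'\rangle_t=(|\Sigma|^2+\int_{B(0,h)^c}|x|^2\,\nu(dx))\,dt\le2K^2\,dt$ and $d\langle L''\rangle_t=F(h)\,dt$.
Consequently,

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|M_s|^2\Bigr]\le4\,\mathbb{E}\Bigl[2K^4\int_0^t|Z'_{s-}|^2\,ds+K^2F(h)\int_0^t|Z''_{s-}|^2\,ds\Bigr].$$

Hence, by (12) and Fubini's theorem, one has

$$\mathbb{E}\Bigl[\sup_{s\in[0,t]}|Z_s|^2\Bigr]\le\kappa_1\int_0^t\bigl[z(s)+\mathbb{E}[|Z'_{s-}|^2]+F(h)\,\mathbb{E}[|Z''_{s-}|^2]\bigr]\,ds$$

for a constant $\kappa_1$ that depends only on $K$. Since $Z'_t=Z_t+\bar Y_t-\bar Y'_{\iota(t)}$ and
$Z''_t=Z_t+\bar Y_t-\bar Y_{\eta(t)}$, we get

$$z(t)\le\kappa_2\int_0^t\bigl[z(s)+\mathbb{E}[|\bar Y_s-\bar Y'_{\iota(s)}|^2]+F(h)\,\mathbb{E}[|\bar Y_s-\bar Y_{\eta(s)}|^2]\bigr]\,ds\tag{13}$$

for an appropriate constant $\kappa_2=\kappa_2(K)$.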

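2nd step. In the second step we provide appropriate estimates for $\mathbb{E}[|\bar Y_t-\bar Y'_{\iota(t)}|^2]$
and $\mathbb{E}[|\bar Y_t-\bar Y_{\eta(t)}|^2]$. The processes $W$ and $L''$ are independent of the
random time $\iota(t)$. Moreover, $L'$ has no jumps in $(\iota(t),t)$ and we obtain

$$\bar Y_t-\bar Y'_{\iota(t)}=\bar Y'_t-\bar Y'_{\iota(t)}+a(\bar Y'_{\eta(t)})(L''_t-L''_{\eta(t)})$$
$$=a(\bar Y'_{\iota(t)})\bigl(\Sigma(W_t-W_{\iota(t)})+(b-F_0(h))(t-\iota(t))\bigr)+a(\bar Y'_{\eta(t)})(L''_t-L''_{\eta(t)})$$

so that

$$\mathbb{E}[|\bar Y_t-\bar Y'_{\iota(t)}|^2]\le3K^2\bigl[\mathbb{E}[(|\bar Y'_{\iota(t)}-y_0|+1)^2]\bigl(|\Sigma|^2\varepsilon+|b-F_0(h)|^2\varepsilon^2\bigr)+\mathbb{E}[(|\bar Y'_{\eta(t)}-y_0|+1)^2]F(h)\varepsilon'\bigr].$$

By Lemma A.2, there exists a constant $\kappa_3=\kappa_3(K)$ such that

$$\mathbb{E}[|\bar Y_t-\bar Y'_{\iota(t)}|^2]\le\kappa_3\bigl[|\Sigma|^2\varepsilon+|b-F_0(h)|^2\varepsilon^2+F(h)\varepsilon'\bigr].\tag{14}$$

Similarly, we estimate $\mathbb{E}[|\bar Y_t-\bar Y_{\eta(t)}|^2]$. Given $\eta(t)$, $(L'_{\eta(t)+u}-L'_{\eta(t)})_{u\in[0,(\varepsilon'-\varepsilon)\wedge(t-\eta(t))]}$
is distributed as the unconditioned Lévy process
$L'$ on the time interval $[0,(\varepsilon'-\varepsilon)\wedge(t-\eta(t))]$. Moreover, we have $dL'_u=-F_0(h)\,du$
on $(\eta(t)+\varepsilon'-\varepsilon,t]$. Consequently,

$$\bar Y_t-\bar Y_{\eta(t)}=\int_{\eta(t)}^t1_{\{s-\eta(t)\le\varepsilon'-\varepsilon\}}a(\bar Y'_{\iota(s-)})\,d(\Sigma W_s+L'_s+bs)$$
$$+\int_{\eta(t)}^t1_{\{s-\eta(t)>\varepsilon'-\varepsilon\}}a(\bar Y'_{\iota(s-)})\,d(\Sigma W_s+(b-F_0(h))s)+a(\bar Y'_{\eta(t)})(L''_t-L''_{\eta(t)}),$$

and analogously to (14) we now get that

$$\mathbb{E}[|\bar Y_t-\bar Y_{\eta(t)}|^2]\le\kappa_4\bigl[\varepsilon'+|b-F_0(h)|^2\varepsilon^2\bigr]$$

for a constant $\kappa_4=\kappa_4(K)$. Next, note that, by the Cauchy-Schwarz inequality,
$|F_0(h)|^2\le\int_{B(0,h)^c}|x|^2\,\nu(dx)\cdot\nu(B(0,h)^c)\le\frac{K^2}\varepsilon$ so that we arrive at

$$\mathbb{E}[|\bar Y_t-\bar Y_{\eta(t)}|^2]\le\kappa_5\varepsilon'.$$

Combining this estimate with (13) and (14), we obtain

$$z(t)\le\kappa_2\int_0^tz(s)\,ds+\kappa_6\bigl[|\Sigma|^2\varepsilon+F(h)\varepsilon'+|b-F_0(h)|^2\varepsilon^2\bigr].$$

In the case where $\Sigma=0$, the statement of the proposition follows immediately
via Gronwall's inequality. For general $\Sigma$, we obtain the result by
recalling that $|F_0(h)|^2\le\frac{K^2}\varepsilon$. □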

5. Approximation of T by 

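Proposition 5.1. Under the assumptions of Proposition 4.1, one has

$$\mathbb{E}[\|\bar\Upsilon-\Upsilon\|^2]\le\kappa\varepsilon'F(h)$$

for a constant $\kappa$ depending only on $K$.

Proof. The proposition can be proved in the same way as Proposition 4.1. Therefore,
we only provide a sketch of the proof. The arguments from the first step
give, for $t\in[0,1]$,

$$z(t)\le\kappa_1\int_0^t\bigl[z(s)+\mathbb{E}[|\bar\Upsilon_{\iota(s)}-\bar\Upsilon'_{\iota(s)}|^2]+F(h)\,\mathbb{E}[|\bar\Upsilon_{\iota(s)}-\bar\Upsilon_{\eta(s)}|^2]\bigr]\,ds,$$

where $z(t)=\mathbb{E}[\sup_{s\in[0,t]}|\Upsilon_s-\bar\Upsilon_s|^2]$ and $\kappa_1=\kappa_1(K)$ is an appropriate
constant.

Moreover, based on Lemma A.2 the second step leads to

$$\mathbb{E}[|\bar\Upsilon_{\iota(t)}-\bar\Upsilon'_{\iota(t)}|^2]\le\kappa_2\varepsilon'F(h)\quad\text{and}\quad\mathbb{E}[|\bar\Upsilon_{\iota(t)}-\bar\Upsilon_{\eta(t)}|^2]\le\kappa_3\varepsilon'$$

for appropriate constants $\kappa_2=\kappa_2(K)$ and $\kappa_3=\kappa_3(K)$. Then Gronwall's
lemma again implies the statement of the proposition. □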

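Proposition 5.2. Under the assumptions of Proposition 4.1, there exists
a constant $\kappa$ depending only on $K$ and $d_X$ such that, if $\Sigma=0$,

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}|\Upsilon_t-\Upsilon_{\iota(t)}|^2\Bigr]\le\kappa\Bigl[F(h)\varepsilon\log\frac e\varepsilon+|b-F_0(h)|^2\varepsilon^2\Bigr]$$

and, in the general case,

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]}|\Upsilon_t-\Upsilon_{\iota(t)}|^2\Bigr]\le\kappa\varepsilon\log\frac e\varepsilon.$$

Proof. Recall that by definition

$$\Upsilon_t-\Upsilon_{\iota(t)}=\int_{\iota(t)}^ta(\Upsilon_{\iota(s-)})\,d\mathcal{X}_s$$

so that

$$|\Upsilon_t-\Upsilon_{\iota(t)}|^2\le K^2(|\Upsilon_{\iota(t)}-y_0|+1)^2|\mathcal{X}_t-\mathcal{X}_{\iota(t)}|^2.$$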
Next, we apply Lemma A. 4. For j £ we choose 

Uj = \TT'M-yo\^ and Vj = sup \Xt - X,(_t)\^ 

with the convention that the supremum of the empty set is zero. Then 

|2" 



E 



sup \Tt - T,(j)| 
te[o,i] 



< E 


sup Uj 


•E 


sup Vj 








Ljez+ J 


< E 


sup (|Ti - 


yo\ + i? 




46 [0,1] 







■E 



sup \Xt — Xs 

0<s<t<l 



By Proposition 5.1 and Lemma A.2, $\mathbb{E}[\sup_{t\in[0,1]}(|\Upsilon_t-y_0|+1)^2]$ is bounded
by a constant that depends only on $K$.

Consider $\varphi:[0,1]\to[0,\infty)$, $\delta\mapsto\sqrt{\delta\log(e/\delta)}$. By Lévy's modulus of continuity,

$$\|W\|_\varphi:=\sup_{0\le s<t\le1}\frac{|W_t-W_s|}{\varphi(t-s)}$$

is finite almost surely, so that Fernique's theorem implies that $\mathbb{E}[\|W\|_\varphi^2]$ is
finite too. Consequently,



E 



(15) 



sup \Xs - 

se[o,t] 



< 3 



+ F(/i))E[||Ty||^]elog ^ + |6 - Fo(/i)| V 

2 ^ K' 



The result follows immediately by using that $|F_0(h)|^2\le\frac{K^2}\varepsilon$ and ruling out
the asymptotically negligible terms. □



6. Gaussian approximation via Komlós, Major and Tusnády. In this section,
we prove the following theorem.

Theorem 6.1. Let $h>0$ and $L=(L_t)_{t\ge0}$ be a $d$-dimensional $(\nu,0)$-Lévy
martingale whose Lévy measure $\nu$ is supported on $B(0,h)$. Moreover,
we suppose that for $\vartheta\ge1$, one has

$$\int\langle y',x\rangle^2\,\nu(dx)\le\vartheta\int\langle y,x\rangle^2\,\nu(dx)$$

for any $y,y'\in\mathbb{R}^d$ with $|y|=|y'|$, and set $\sigma^2=\int|x|^2\,\nu(dx)$.
There exist constants $c_1,c_2>0$ depending only on $d$ such that the following
statement is true. For every $T>0$, one can couple the process $(L_t)_{t\in[0,T]}$ with
a Wiener process $(B_t)_{t\in[0,T]}$ such that

$$\mathbb{E}\exp\Bigl\{\frac{c_1}{\sqrt\vartheta\,h}\sup_{t\in[0,T]}|L_t-\Sigma B_t|\Bigr\}\le\exp\Bigl\{c_2\log\Bigl(\frac{\sigma^2T}{h^2}\vee e\Bigr)\Bigr\},$$

where $\Sigma$ is a square matrix with $\Sigma\Sigma^*=\mathrm{cov}_{L_1}$ and $\sigma^2=\int|x|^2\,\nu(dx)$.

The proof of the theorem is based on Zaitsev's generalization [22] of the
Komlós-Major-Tusnády coupling. In this context, a key quantity is the Zaitsev
parameter: Let $Z$ be a $d$-dimensional random variable with finite exponential
moments in a neighborhood of zero and set

$$\Lambda(\theta)=\log\mathbb{E}\exp\{\langle\theta,Z\rangle\}$$

for all $\theta\in\mathbb{C}^d$ with integrable expectation. Then the parameter is defined as

$$\tau(Z)=\inf\bigl\{\tau>0:|\partial_w\partial_v^2\Lambda(\theta)|\le\tau\langle\mathrm{cov}_Zv,v\rangle\text{ for all }\theta\in\mathbb{C}^d,\ v,w\in\mathbb{R}^d\text{ with }|\theta|\le\tfrac1\tau\text{ and }|w|=|v|=1\bigr\}.$$

In the latter set, we implicitly only consider $\tau$'s for which $\Lambda$ is finite on a
neighborhood of $\{x\in\mathbb{C}^d:|x|\le1/\tau\}$. Moreover, $\mathrm{cov}_Z$ denotes the covariance
matrix of $Z$.

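Proof of Theorem 6.1. 1st step: First, consider a $d$-dimensional infinitely divisible random variable $Z$ with

$$\Lambda(\theta) := \log \mathbb{E} e^{\langle\theta, Z\rangle} = \int (e^{\langle\theta, x\rangle} - \langle\theta, x\rangle - 1)\,\nu'(dx),$$

where the Lévy measure $\nu'$ is supported on the ball $B(0,h')$ for a fixed $h' > 0$. Then

$$\partial_w \partial_v^2 \Lambda(\theta) = \int_{B(0,h')} \langle w, x\rangle \langle v, x\rangle^2 e^{\langle\theta, x\rangle}\,\nu'(dx)$$

and

$$\langle \mathrm{cov}_Z v, v\rangle = \mathrm{var}\langle v, Z\rangle = \partial_v^2 \Lambda(0) = \int_{B(0,h')} \langle v, x\rangle^2\,\nu'(dx).$$

We choose $\zeta > 0$ with $e^{\zeta} = 1/\zeta$, and observe that for any $\theta \in \mathbb{C}^d$, $v, w \in \mathbb{R}^d$ with $|\theta| \le \zeta/h'$ and $|w| = |v| = 1$,

$$|\partial_w \partial_v^2 \Lambda(\theta)| \le h' e^{|\theta| h'} \langle \mathrm{cov}_Z v, v\rangle \le \frac{h'}{\zeta} \langle \mathrm{cov}_Z v, v\rangle.$$

Hence,

$$\tau(Z) \le \frac{h'}{\zeta}.$$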
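
The constant chosen in the first step is the unique positive solution of $\zeta e^{\zeta} = 1$. As a quick numerical sketch (Newton's method is my choice here, not something the paper prescribes, and the function names are illustrative), it can be computed as follows, together with the resulting bound $h'/\zeta$ on the Zaitsev parameter.

```python
import math


def solve_zeta(tol=1e-15, max_iter=50):
    """Solve zeta * exp(zeta) = 1 for zeta > 0 by Newton's method."""
    z = 0.5
    for _ in range(max_iter):
        f = z * math.exp(z) - 1.0
        step = f / ((z + 1.0) * math.exp(z))  # f'(z) = (z + 1) * exp(z)
        z -= step
        if abs(step) < tol:
            break
    return z


def zaitsev_bound(h_prime, zeta):
    """Upper bound h'/zeta for a Levy measure supported on B(0, h')."""
    return h_prime / zeta


zeta = solve_zeta()
bound = zaitsev_bound(0.1, zeta)
```

Since $\zeta < 1$, the bound is a constant multiple of the support radius $h'$ that is strictly larger than $h'$ itself.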

2nd step: In the next step, we apply Zaitsev's coupling to piecewise constant interpolations of $(L_t)$. Fix $m \in \mathbb{N}$ and consider $L^{(m)} = (L^{(m)}_t)_{t\in[0,T]}$ given via

$$L^{(m)}_t = L_{\lfloor 2^m t/T\rfloor 2^{-m} T}.$$

Moreover, we consider a $d$-dimensional Wiener process $B = (B_t)_{t\ge 0}$ and its piecewise constant interpolation given by $B^{(m)} = (B_{\lfloor 2^m t/T\rfloor 2^{-m} T})_{t\in[0,T]}$.
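
The time change $t \mapsto \lfloor 2^m t/T\rfloor 2^{-m} T$ simply snaps $t$ to the nearest dyadic grid point of $[0,T]$ to its left. A short sketch, with hypothetical helper names and a deterministic function standing in for a sample path:

```python
import math


def dyadic_floor(t, T, m):
    """Return floor(2^m * t / T) * 2^(-m) * T, the grid point at or to the left of t."""
    if not 0.0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    return math.floor((2 ** m) * t / T) * T / (2 ** m)


def interpolate(path, T, m):
    """Piecewise constant interpolation of `path` on the level-m dyadic grid of [0, T]."""
    return lambda t: path(dyadic_floor(t, T, m))


# A smooth stand-in for a path; at level m = 2 the grid points are 0, 0.25, 0.5, 0.75, 1.
step_path = interpolate(lambda s: s * s, 1.0, 2)
```

As $m$ grows the grid refines, which is what the approximation argument in the third step relies on.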

Since $\mathrm{cov}_{L_t}$ is self-adjoint, we find a representation $\mathrm{cov}_{L_t} = t\,UDU^*$ with $D$ diagonal and $U$ orthogonal. Hence, for $A_t := (tD)^{-1/2}U^*$ we get $\mathrm{cov}_{A_t L_t} = I_d$. We denote by $\lambda_1$ the leading and by $\lambda_2$ the minimal eigenvalue of $D$ (or $\mathrm{cov}_{L_1}$). Then $A_t L_t$ is again infinitely divisible and the corresponding Lévy measure is supported on $B(0, h/\sqrt{\lambda_2 t})$. By part one, we conclude that

$$\tau(A_t L_t) \le \frac{h}{\zeta\sqrt{\lambda_2 t}}.$$



Now the discontinuities of $A_{2^{-m}T} L^{(m)}$ are i.i.d. with unit covariance and Zaitsev parameter less than or equal to $h/(\zeta\sqrt{\lambda_2 2^{-m} T})$. By [22], Theorem 1.3, one can couple the processes $L$ and $\Sigma B$ on an appropriate probability space such that



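$$\mathbb{E}\exp\Bigl\{\kappa_1 \frac{\sqrt{\lambda_2 2^{-m} T}}{h} \sup_{t\in[0,T]} \bigl|A_{2^{-m}T} L^{(m)}_t - A_{2^{-m}T} \Sigma B^{(m)}_t\bigr|\Bigr\} \le \exp\Bigl\{\kappa_2 \log\Bigl(\frac{\lambda_2 T}{h^2} \vee e\Bigr)\Bigr\},$$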



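where $\kappa_1, \kappa_2 > 0$ are constants only depending on the dimension $d$. The smallest eigenvalue of $A_{2^{-m}T}$ is $2^{m/2}(T\lambda_1)^{-1/2}$ and, by assumption, $\lambda_1 \le \vartheta\lambda_2$. Since $\lambda_2 \le \sigma^2$, we get

$$\mathbb{E}\exp\Bigl\{\frac{\kappa_1}{\sqrt{\vartheta}\,h} \sup_{t\in[0,T]} \bigl|L^{(m)}_t - \Sigma B^{(m)}_t\bigr|\Bigr\} \le \exp\Bigl\{\kappa_2 \log\Bigl(\frac{\sigma^2 T}{h^2} \vee e\Bigr)\Bigr\}.$$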



3rd step: The general result follows by approximation. First, note that $\sup_{t\in[0,T]} |L_t - L^{(m)}_t|$ converges as $m \to \infty$ to $\sup_{t\in[0,T]} |L_t - L_{t-}|$ so that by dominated convergence



lim Eexp-^ ki sup |Lt — 

V^ht£[0,T] 

■ Eexp-^ Ki sup |Lt — I ^ < 

V-dh te[o,T] 
Analogously, $\lim_{m\to\infty} \mathbb{E}\exp\{\frac{\kappa_1}{\sqrt{\vartheta}\,h} \sup_{t\in[0,T]} |\Sigma B_t - \Sigma B^{(m)}_t|\} = 1$. Next, we

choose K3 > 1 with e'^^ + 1 < e'^^^'^^ and we fix m G N such that 

Eexpl— i- sup |Lf — Sij^l I + Eexpl Ki sup — SiJj™"^ 

I 3 t6[o,r] J I V'i9^ie[o,T] 

We apply the coupling introduced in step 2 and estimate 
Eexpl-^ sup |Lf — Sij^l I < Eexpl sup |Lt — 



L 3 v^/i t6[o,T] J I V-dh te[o,r] 

+ EexpjKi^^ sup IL^""^ - S^l""^! 
I Vi9/ite[o,T] 

+ Eexp|Ki— ^ sup ISSJ""^ - SStI 

< exp|K2 log(^^ V I + e'^2+^^ 

Straightforwardly, one obtains the assertion of the theorem for $c_1 = \kappa_1/3$
and $c_2 = \kappa_2 + 2\kappa_3$. □



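Corollary 6.2. The coupling introduced in Theorem 6.1 satisfies

$$\mathbb{E}\Bigl[\sup_{t\in[0,T]} |L_t - \Sigma B_t|^2\Bigr]^{1/2} \le \frac{\sqrt{\vartheta}\,h}{c_1}\Bigl(c_2 \log\Bigl(\frac{\sigma^2 T}{h^2} \vee e\Bigr) + 2\Bigr),$$

where $c_1$ and $c_2$ are as in the theorem.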

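Proof. We set $Z = \sup_{t\in[0,T]} |L_t - \Sigma B_t|$ and $t_0 = \frac{\sqrt{\vartheta}\,h}{c_1} c_2 \log(\frac{\sigma^2 T}{h^2} \vee e)$, and use that

$$\mathbb{E}[Z^2] = 2\int_0^\infty t\,\mathbb{P}(Z \ge t)\,dt \le t_0^2 + 2\int_{t_0}^\infty t\,\mathbb{P}(Z \ge t)\,dt. \tag{16}$$

By the Markov inequality and Theorem 6.1, one has for $s \ge 0$

$$\mathbb{P}(Z \ge s + t_0) \le \frac{\mathbb{E}[\exp\{c_1/(\sqrt{\vartheta}\,h)\,Z\}]}{\exp\{c_1/(\sqrt{\vartheta}\,h)(s + t_0)\}} \le \exp\Bigl\{-\frac{c_1}{\sqrt{\vartheta}\,h}\,s\Bigr\}.$$

We set $a = \sqrt{\vartheta}\,h/c_1$ and deduce together with (16) that

$$\mathbb{E}[Z^2] \le t_0^2 + 2\int_0^\infty (s + t_0) \exp\Bigl\{-\frac{1}{a}s\Bigr\}\,ds = t_0^2 + 2t_0 a + 2a^2 \le (t_0 + 2a)^2. \qquad\Box$$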
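
The closed form in the last display of this proof can be checked by an elementary computation. As a short worked step (with $a$ and $t_0$ as in the proof), one has:

```latex
\begin{align*}
\int_0^\infty (s + t_0)\, e^{-s/a}\, ds
  &= \int_0^\infty s\, e^{-s/a}\, ds + t_0 \int_0^\infty e^{-s/a}\, ds
   = a^2 + t_0 a, \\
t_0^2 + 2\,(a^2 + t_0 a)
  &= t_0^2 + 2 t_0 a + 2 a^2
   \le t_0^2 + 4 t_0 a + 4 a^2 = (t_0 + 2a)^2 .
\end{align*}
```

Taking square roots and inserting the values of $t_0$ and $a$ gives the bound stated in Corollary 6.2.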
7. Coupling the Gaussian approximation. We are now in the position to couple the processes $L''$ and $B$ introduced in Section 3.1. We adopt again the notation of Section 3.1.

To introduce the coupling, we need to assume that Assumption UE is valid, and that $\varepsilon \in (0, \frac12]$, $\varepsilon' \in [2\varepsilon, 1]$ and $h \in (0, \bar h]$ are such that $\nu(B(0,h)^c) \le \frac{1}{\varepsilon}$. Recall that $L''$ is independent of $W$ and $L'$. In particular, it is independent of the times in $\mathbb{J}$, and given $W$ and $L'$ we couple the Wiener process $B$ with $L''$ on each interval $[T_j, T_{j+1}]$ according to the coupling provided by Theorem 6.1.

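More explicitly, the coupling is established in such a way that, given $\mathbb{J}$, each pair of processes $(B_{t+T_j} - B_{T_j})_{t\in[0,T_{j+1}-T_j]}$ and $(L''_{t+T_j} - L''_{T_j})_{t\in[0,T_{j+1}-T_j]}$ is independent of $W$, $L'$ and the other pairings, and satisfies

$$\mathbb{E}\Bigl[\exp\Bigl\{\frac{c_1}{\sqrt{\vartheta}\,h} \sup_{t\in[T_j,T_{j+1}]} \bigl|L''_t - L''_{T_j} - (\Sigma' B_t - \Sigma' B_{T_j})\bigr|\Bigr\} \Big|\, \mathbb{J}\Bigr] \le \exp\Bigl\{c_2 \log\Bigl(\frac{F(h)(T_{j+1} - T_j)}{h^2} \vee e\Bigr)\Bigr\} \tag{17}$$

for positive constants $c_1$ and $c_2$ depending only on $d_X$, see Theorem 6.1. In particular, by Corollary 6.2, one has

$$\mathbb{E}\Bigl[\sup_{t\in[T_j,T_{j+1}]} \bigl|L''_t - L''_{T_j} - (\Sigma' B_t - \Sigma' B_{T_j})\bigr|^2 \Big|\, \mathbb{J}\Bigr]^{1/2} \le c_3 h \log\Bigl(\frac{F(h)(T_{j+1} - T_j)}{h^2} \vee e\Bigr) \tag{18}$$

for a constant $c_3 = c_3(d_X, \vartheta)$.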



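Proposition 7.1. Under Assumption UE, there exists a constant $\kappa$ depending only on $K$, $\vartheta$ and $d_X$ such that for any $\varepsilon \in (0, \frac12]$, $\varepsilon' \in [2\varepsilon, 1]$ and $h \in (0, \bar h]$ with $\nu(B(0,h)^c) \le \frac{1}{\varepsilon}$ one has

$$\mathbb{E}\Bigl[\sup_{t\in[0,1]} |Y'_t - \Upsilon'_t|^2\Bigr] \le \kappa \frac{1}{\varepsilon'} h^2 \log\Bigl(\frac{\varepsilon' F(h)}{h^2} \vee e\Bigr)^2.$$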



Proof. For ease of notation, we write 
At 



T" 



and A[ = T! B^(^t) . 



By construction, (At) and {A[) are martingales with respect to the filtration 
(Tt) induced by the processes (Wt), (L'J, (At) and {A[). Let Zt = - 



Z't 



y 



is similar to the proof of Proposition 4.1. 



T^(^) and z{t) = E[sups&[o,t] l^sP]- The proof 
Again, we write



Zt = / (a(i;'(,_))-a(T:(,_)))d(SM/.+L'J+ / a{Y:^^,_^) dA, - / a(T;(,_))dA; 

JO ^0 ^0 

— •.Alt (localmartingalc) 

(19) ^ 

Jo 

Denoting M' = T.W + L',we get 

dMt = (a(y-/(,_)) - a(T:(,_))) dA// + a(y;;(,_)) d(^< - A',) 

+ («(>;'(,_)) -a(T;(,_)))dA; 

and, by Doob's inequality and Lemma A.l, we have 



(20) 



E 


sup Af s p 




E 















+ E 


[i 







zM2d(^') 



+ E 



i\Y^^^_^\ + lfd{A-A') 



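Each bracket $\langle\cdot\rangle$ in the latter formula can be chosen with respect to a (possibly different) filtration such that the integrand is predictable and the integrator is a local $L^2$-martingale. As noticed before, with respect to the canonical filtration $(\mathcal{F}_t)$ one has $d\langle M'\rangle_t = (|\Sigma|^2 + \int_{B(0,h)^c} |x|^2\,\nu(dx))\,dt \le 2K^2\,dt$. Moreover, we have with respect to the enlarged filtration $(\mathcal{F}_t \vee \sigma(\mathbb{J}))_{t\ge 0}$,

$$\langle A'\rangle_t = \sum_{\{j\in\mathbb{N} :\, T_j \le t\}} (T_j - T_{j-1}) F(h) = \max(\mathbb{J} \cap [0,t]) \cdot F(h),$$

and, by (18), for $j \in \mathbb{N}$,

$$\Delta\langle A - A'\rangle_{T_j} = \mathbb{E}\bigl[|L''_{T_j} - L''_{T_{j-1}} - (\Sigma' B_{T_j} - \Sigma' B_{T_{j-1}})|^2 \,\big|\, \mathbb{J}\bigr] \le c_3^2 \xi^2,$$

where $\xi := h \log(\frac{\varepsilon' F(h)}{h^2} \vee e)$. Note that two discontinuities of $\langle A - A'\rangle$ are at least $\varepsilon'/2$ units apart and the integrands of the last two integrals in (20) are constant on $(T_{j-1}, T_j]$ so that altogether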

E sup IMJ^ <Ki 2K^E / \ZTds +F{h)'E / iZ'/Pds 



+ l)'ds 



sup \Ms\^ 


< Ki 


2K^E 


f\Z'fds 


+ F{h)E 


li 


'-se[o,t] J 






Jo 











ri(s-) 



With Lemma A.2 and Fubini's theorem, we arrive at

''\{s)ds+eli 



E 


sup |Msp 


< K2 
Moreover, by Jensen's inequality, one has



E 


sup 


I 




.sG[0,t] 





EWZ 



I |2i 



ds. 



Combining the latter two estimates with (19) and applying Gronwall's in- 
equality yields the statement of the proposition. □ 

Proposition 7.2. There exists a constant k depending only on K and 
dx such that 



E[||y-y'-(T-T')||Y^^< 



h 



log 1 + - +lo, 



fF{h)e' 

+ ,/F(/i)e'log4E[||y'-T'||Y/' 



Proof. Note that 

Yt - Yl - {% - %) = a(y;,))(L;' - - a(T;(,))(S'S, - 

= a(i;'(,))(L'/ - - {^'Bt - 

+ («(!;'(,)) -a(T;(,)))(S'i?,-S'i?,(,)). 

Similarly to the proof of Proposition 5.2, we apply Lemma A.4 to deduce
that



E[||y-y'- (T-T')llY^^ 



(21) <KE[{\\Y'\\ + lf]^/^E 

+ KE[\\Y' - T'llY^^E 



sup|L;'-L"(,)-(S'i3t-S'i3,(,): 

4e[o,i] 



1/2 



sup \T,'Bt - S'-Br,(t)| 
■te[o,i] 



1/2 



Next, we estimate E[sup(g[o^i] \L'-l — L''^(^^^ — {Ti'Bt — S'fi;j(t))p]. Recall that 
conditional on $\mathbb{J}$, each pairing of $(L''_{t+T_j} - L''_{T_j})_{t\in[0,T_{j+1}-T_j]}$ and $(B_{t+T_j} - B_{T_j})_{t\in[0,T_{j+1}-T_j]}$ is coupled according to Theorem 6.1, and individual pairs
are independent of each other.

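Let us first assume that the times in $\mathbb{J}$ are deterministic with mesh smaller than or equal to $\varepsilon'$. We denote by $n$ the number of entries of $\mathbb{J}$ which fall into $[0,1]$, and we denote, for $j = 1, \dots, n$,

$$\Delta_j = \sup_{t\in[T_{j-1},T_j]} \bigl|L''_t - L''_{T_{j-1}} - (\Sigma' B_t - \Sigma' B_{T_{j-1}})\bigr|.$$

By (17) and the Markov inequality, one has, for $u \ge 0$,

$$\mathbb{P}\Bigl(\sup_{j=1,\dots,n} \Delta_j \ge u\Bigr) \le \sum_{j=1}^n \mathbb{P}(\Delta_j \ge u) \le n \exp\Bigl\{c_2 \log\Bigl(\frac{F(h)\varepsilon'}{h^2} \vee e\Bigr) - \frac{c_1}{\sqrt{\vartheta}\,h}\,u\Bigr\}.$$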
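Let now $\alpha = \frac{c_1}{\sqrt{\vartheta}\,h}$ and $u_0 = \frac{1}{\alpha}\bigl(\log n + c_2 \log(\frac{F(h)\varepsilon'}{h^2} \vee e)\bigr)$. Then for $u \ge u_0$,

$$\mathbb{P}\Bigl(\sup_{j=1,\dots,n} \Delta_j \ge u\Bigr) \le e^{-\alpha(u - u_0)}$$

so that

$$\mathbb{E}\Bigl[\sup_{j=1,\dots,n} \Delta_j^2\Bigr] = 2\int_0^\infty u\,\mathbb{P}\Bigl(\sup_{j=1,\dots,n} \Delta_j \ge u\Bigr)\,du \le u_0^2 + 2\int_{u_0}^\infty u\,e^{-\alpha(u - u_0)}\,du = u_0^2 + \frac{2}{\alpha}u_0 + \frac{2}{\alpha^2} \le \Bigl(u_0 + \frac{2}{\alpha}\Bigr)^2.$$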

Note that the upper bound depends only on the number of entries in Jn [0, 1] , 
and, since #(JIn [0,1]) is uniformly bounded by p + 1, we thus get in the 
general random setting that 

' sup m - L';,) - {J:'Bt - 

46 [0,1] 



E 



< 



Cl 



log( 1 + - ) +C2l0| 



Together with Lemma A.2, this gives the appropriate upper bound for the
first summand in (21).

By the argument preceding (15), one has 



E 



1/2 



sup < Ki\^'\ J £' log- = Ki J F{h)e' log - 



where $\kappa_1$ is a constant that depends only on $d_X$. This estimate is used for
the second summand in (21) and putting everything together yields the
statement. □

8. Proof of the main results. 



Proof of Theorem 1.1. We consider a multilevel Monte Carlo algorithm $\widehat S \in \mathcal{A}$ partially specified by $\varepsilon_k := 2^{-k}$ and $h_k := g^{-1}(2^k)$ for $k \in \mathbb{Z}_+$. The maximal index $m \in \mathbb{N}$ and the number of iterations $n_1, \dots, n_m \in \mathbb{N}$ are fixed explicitly below in such a way that $h_m \le \bar h$ and $m \ge 2$. Recall that

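$$\mathrm{mse}(\widehat S) \le \mathcal{W}(Y, \Upsilon^{(m)})^2 + \sum_{k=2}^m \frac{1}{n_k} \mathbb{E}\bigl[\|\Upsilon^{(k)} - \Upsilon^{(k-1)}\|^2\bigr] + \frac{1}{n_1} \mathbb{E}\bigl[\|\Upsilon^{(1)} - y_0\|^2\bigr];$$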



see (6). We control the Wasserstein metric via Corollary 3.2. Moreover, we deduce from [6], Theorem 2, that there exists a constant $\kappa_0$ that depends only on $K$ and $d_X$ such that, for $k = 2, \dots, m$,

$$\mathbb{E}\bigl[\|\Upsilon^{(k)} - \Upsilon^{(k-1)}\|^2\bigr] \le \kappa_0\bigl(\varepsilon_{k-1} \log(e/\varepsilon_{k-1}) + F(h_{k-1})\bigr)$$

and

$$\mathbb{E}\bigl[\|\Upsilon^{(1)} - y_0\|^2\bigr] \le \kappa_0\bigl(\varepsilon_0 \log(e/\varepsilon_0) + F(h_0)\bigr).$$
Consequently, one has 

1 \ , e 

- + 

k=0 



mse(S') < Ki 



hi 



+ em ] loL 



m— 1 

+ E — 



F{hk)+ek\og- 



(22) 

in the general case, and 



mse(S') < K2 



(23) 



/i^-^log-^ + |6-Fo(/i)|2e^ 



fc=0 



F{hk) + Ek log — 



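in the case where $\Sigma = 0$. Note that $F(h_k) \le h_k^2 g(h_k) = g^{-1}(2^k)^2 2^k$. With Lemma A.3, we conclude that $h_k = g^{-1}(2^k) \succsim (\gamma/2)^k$ so that $\varepsilon_k \log\frac{e}{\varepsilon_k} = 2^{-k} \log(e 2^k) \precsim g^{-1}(2^k)^2 2^k$. Hence, we can bound $F(h_k) + \varepsilon_k \log\frac{e}{\varepsilon_k}$ from above by a multiple of $h_k^2 g(h_k)$ in (22) and (23).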

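By Lemma A.3, we have $|F_0(h_m)| \precsim h_m/\varepsilon_m$ as $m \to \infty$. Moreover, in the case with general $\Sigma$ and $g^{-1}(x) \succsim x^{-3/4}$, we have $h_m^2/\sqrt{\varepsilon_m} \succsim \varepsilon_m$. Hence, in case (I), there exists a constant $\kappa_3$ such that

$$\mathrm{mse}(\widehat S) \le \kappa_3\Bigl[\frac{h_m^2}{\sqrt{\varepsilon_m}} \log\frac{e}{\varepsilon_m} + \sum_{k=0}^{m-1} \frac{1}{n_{k+1}} h_k^2 g(h_k)\Bigr]. \tag{24}$$

Conversely, in case (II), i.e. $g^{-1}(x) \precsim x^{-3/4}$, the term $h_m^2/\sqrt{\varepsilon_m}$ is negligible in (22), and we get

$$\mathrm{mse}(\widehat S) \le \kappa_4\Bigl[\varepsilon_m \log\frac{e}{\varepsilon_m} + \sum_{k=0}^{m-1} \frac{1}{n_{k+1}} h_k^2 g(h_k)\Bigr] \tag{25}$$

for an appropriate constant $\kappa_4$.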

Now, we specify $n_1, \dots, n_m$ in dependence on a positive parameter $Z$ with $Z \ge 1/g^{-1}(2^m)$. We set $n_{k+1} = n_{k+1}(Z) = \lfloor Z g^{-1}(2^k)\rfloor \ge \frac12 Z g^{-1}(2^k)$ for $k = 0, \dots, m-1$ and conclude that, by (30),



$$\sum_{k=0}^{m-1} \frac{1}{n_{k+1}} h_k^2 g(h_k) \le \frac{2}{Z} \sum_{k=0}^{m-1} 2^k g^{-1}(2^k) \le \kappa_5 \frac{1}{Z} 2^m g^{-1}(2^m) \sum_{k=0}^{m-1} \gamma^{-(m-k)}. \tag{26}$$
Similarly, we get with (7)

$$\mathrm{cost}(\widehat S) \le 3 \sum_{k=0}^{m-1} 2^{k+1} n_{k+1} \le \kappa_6 Z 2^m g^{-1}(2^m). \tag{27}$$
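
The level parameters fixed so far can be tabulated in a few lines. The sketch below assumes a power law $g(h) = \kappa h^{-\beta}$ (so that $g^{-1}(u) = (\kappa/u)^{1/\beta}$), which is only one admissible choice of $g$; the function names and default values are illustrative.

```python
import math


def g_inverse(u, kappa, beta):
    """Inverse of the assumed power law g(h) = kappa * h**(-beta)."""
    return (kappa / u) ** (1.0 / beta)


def level_parameters(m, Z, kappa=1.0, beta=1.5):
    """Return eps_k = 2^-k, h_k = g^{-1}(2^k) and n_{k+1} = floor(Z * h_k) for k < m."""
    if Z * g_inverse(2.0 ** m, kappa, beta) < 1.0:
        raise ValueError("need Z >= 1 / g^{-1}(2^m)")
    levels = []
    for k in range(m):
        h_k = g_inverse(2.0 ** k, kappa, beta)
        levels.append(
            {"k": k, "eps": 2.0 ** (-k), "h": h_k, "n_next": math.floor(Z * h_k)}
        )
    return levels


# With kappa = beta = 1 one has g^{-1}(2^k) = 2^-k, which keeps the numbers exact.
levels = level_parameters(m=3, Z=8.0, kappa=1.0, beta=1.0)
```

The constraint on $Z$ guarantees that every level receives at least one sample, and the floor costs at most a factor of two, as used in the text.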



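We proceed with case (I). By (24) and (26),

$$\mathrm{mse}(\widehat S) \le \kappa_7\Bigl[g^{-1}(2^m)^2\, 2^{m/2}\, m + \frac{1}{Z} 2^m g^{-1}(2^m)\Bigr] \tag{28}$$

so that, for $Z := 2^{m/2}/(m\, g^{-1}(2^m))$,

$$\mathrm{mse}(\widehat S) \le 2\kappa_7\, g^{-1}(2^m)^2\, 2^{m/2}\, m$$

and, by (27),

$$\mathrm{cost}(\widehat S) \le \kappa_6 \frac{2^{(3/2)m}}{m}.$$

For a positive parameter $\tau$, we choose $m = m(\tau) \in \mathbb{N}$ as the maximal integer with $\kappa_6 2^{(3/2)m}/m \le \tau$. Here, we suppose that $\tau$ is sufficiently large to ensure the existence of such an $m$ and the property $h_m \le \bar h$. Then $\mathrm{cost}(\widehat S) \le \tau$. Since $2^m \approx (\tau \log\tau)^{2/3}$, we conclude that

$$\mathrm{mse}(\widehat S) \precsim g^{-1}\bigl((\tau\log\tau)^{2/3}\bigr)^2\, \tau^{1/3} (\log\tau)^{4/3}.$$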
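
The choice of the maximal level in case (I) is easy to mimic numerically. In the sketch below the constant $\kappa_6$ is treated as a free input (its actual value is not specified in the text), and the search starts at $m = 2$ because the proof requires $m \ge 2$; the function names are illustrative.

```python
def cost_bound(m, kappa6=1.0):
    """Cost bound kappa6 * 2^(3m/2) / m from the case (I) computation."""
    return kappa6 * 2.0 ** (1.5 * m) / m


def choose_m(tau, kappa6=1.0):
    """Largest integer m >= 2 with cost_bound(m) <= tau.

    The bound is increasing in m for m >= 2, so the first failure ends the search.
    """
    if cost_bound(2, kappa6) > tau:
        raise ValueError("tau is too small to admit any level m >= 2")
    m = 2
    while cost_bound(m + 1, kappa6) <= tau:
        m += 1
    return m


m_small = choose_m(4.0)
m_large = choose_m(100.0)
```

Because the bound grows like $2^{(3/2)m}$ up to the factor $1/m$, the selected $m$ grows only logarithmically in $\tau$, in line with $2^m \approx (\tau\log\tau)^{2/3}$.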
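It remains to consider case (II). Here, (25) and (26) yield

$$\mathrm{mse}(\widehat S) \le \kappa_8\Bigl[2^{-m} m + \frac{1}{Z} 2^m g^{-1}(2^m)\Bigr]$$

so that, for $Z := \frac{1}{m} 2^{2m} g^{-1}(2^m)$,

$$\mathrm{mse}(\widehat S) \le 2\kappa_8 2^{-m} m$$

and, by (27),

$$\mathrm{cost}(\widehat S) \le \kappa_6 \frac{1}{m} 2^{3m} g^{-1}(2^m)^2.$$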

Next, let Z G N such that 2kq2^^j~'^^ < 1. Again we let r be a positive param- 
eter which is assumed to be sufficiently large so that we can pick m = m{Z) 
as the maximal natural number larger than / and satisfying 2™'''' < g*{T). 
Then, by (29), 

^ 1 / 2 \ 1 

cost(5) < Ke—2^"'g-\2"'f < 2^62-^^ - n^im+l) g~U2m+h2 < ^_ 

m \^ J m + l 
Conversely, since 2"™- < 2'+^5'*(r),

mse(5) < 2K82'+i5*(r)^Mog2 5*(T). 

Moreover, g~^{x) ^x~^ so that x"^ (x)'^ / log x x / log x , as x— t-oo. This 
imphes that log g*{T) ;^logr. 

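Proof of Corollary 1.2. We fix $\beta' \in (\beta, 2]$ or $\beta' = 2$ in the case where $\beta = 2$, and note that, by definition of $\beta$,

$$\kappa_1 := \int_{B(0,1)} |x|^{\beta'}\,\nu(dx)$$

is finite. We consider $\bar g : (0,\infty) \to (0,\infty)$, $h \mapsto \int \frac{|x|^2}{h^2} \wedge 1\,\nu(dx)$. For $h \in (0,1]$, one has

$$\bar g(h) = \int_{B(0,1)} \frac{|x|^2}{h^2} \wedge 1\,\nu(dx) + \int_{B(0,1)^c} \frac{|x|^2}{h^2} \wedge 1\,\nu(dx) \le \int_{B(0,1)} \frac{|x|^{\beta'}}{h^{\beta'}}\,\nu(dx) + \int_{B(0,1)^c} 1\,\nu(dx) \le \kappa_2 h^{-\beta'},$$

where $\kappa_2 = \kappa_1 + \nu(B(0,1)^c)$. Hence, we find a decreasing and invertible function $g : (0,\infty) \to (0,\infty)$ that dominates $\bar g$ and satisfies $g(h) = \kappa_2 h^{-\beta'}$ for $h \in (0,1]$. Then for $\gamma = 2^{1-1/\beta'}$, one has $g(\frac{\gamma}{2}h) = 2g(h)$ for $h \in (0,1]$ and we are in the position to apply Theorem 1.1: In the first case, we get

$$\mathrm{err}(\tau) \precsim \tau^{-(4-\beta')/(6\beta')} (\log\tau)^{(2/3)(1-1/\beta')}.$$

In the second case, we assume that $\beta' \le \frac43$ and obtain $g^*(\tau) \approx (\tau\log\tau)^{\beta'/(3\beta'-2)}$ so that

$$\mathrm{err}(\tau) \precsim \tau^{-\beta'/(6\beta'-4)} (\log\tau)^{(\beta'-1)/(3\beta'-2)}.$$

These estimates yield immediately the statement of the corollary.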
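
For orientation, the polynomial rates in these two estimates can be tabulated as a function of the index. The sketch below ignores logarithmic factors, restricts to $\beta \in [1,2]$, and reads the split between the two regimes (Gaussian component present and $\beta \le 4/3$ versus all other cases) from the abstract; the function name is illustrative and exact rational arithmetic is used so the boundary case can be compared without rounding.

```python
from fractions import Fraction


def error_exponent(beta, has_gaussian_part):
    """Exponent r such that the error bound is of order tau^(-r), up to log factors.

    Assumes 1 <= beta <= 2. With a Gaussian component and beta <= 4/3 the rate is
    beta / (6 beta - 4); otherwise it is (4 - beta) / (6 beta).
    """
    beta = Fraction(beta)
    if not (1 <= beta <= 2):
        raise ValueError("this sketch only covers beta in [1, 2]")
    if has_gaussian_part and beta <= Fraction(4, 3):
        return beta / (6 * beta - 4)
    return (4 - beta) / (6 * beta)


worst_case = error_exponent(2, True)
```

The two formulas agree at $\beta = 4/3$, and the worst case $\beta = 2$ reproduces the exponent $1/6$ mentioned in the abstract.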

APPENDIX 

Lemma A.1. Let $(A_t)$ be a previsible process with state space $\mathbb{R}^{d_Y\times d_X}$, let $(L_t)$ be a square integrable $\mathbb{R}^{d_X}$-valued Lévy martingale and denote by $\langle L\rangle$ the process given via

$$\langle L\rangle_t = \sum_{j=1}^{d_X} \langle L^{(j)}\rangle_t,$$

where $\langle L^{(j)}\rangle$ denotes the predictable compensator of the classical bracket process for the $j$th coordinate of $L$. One has, for any stopping time $\tau$ with finite expectation $\mathbb{E}\int_0^\tau |A_s|^2\,d\langle L\rangle_s$, that $(\int_0^{t\wedge\tau} A_s\,dL_s)_{t\ge 0}$ is a uniformly square integrable martingale which satisfies

$$\mathbb{E}\Bigl|\int_0^\tau A_s\,dL_s\Bigr|^2 \le \mathbb{E}\int_0^\tau |A_s|^2\,d\langle L\rangle_s.$$

The statement of the lemma follows from the Itô isometry for Lévy driven stochastic differential equations. See, for instance, [6], Lemma 3, for a proof.

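Lemma A.2. The processes $Y'$ and $\Upsilon$ introduced in Section 3.1 satisfy

$$\mathbb{E}\Bigl[\sup_{s\in[0,1]} |Y'_s - y_0|^2\Bigr] \le \kappa \quad\text{and}\quad \mathbb{E}\Bigl[\sup_{s\in[0,1]} |\Upsilon_s - y_0|^2\Bigr] \le \kappa,$$

where $\kappa$ is a constant that depends only on $K$.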



Proof. The result is proven via a standard Gronwall inequality type ar- 
gument that is similar to the proofs of the above propositions. It is therefore 
omitted. □ 

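Lemma A.3. Let $\bar h > 0$, $\gamma \in (1,2)$ and $g : (0,\infty) \to (0,\infty)$ be an invertible and decreasing function such that, for $h \in (0, \bar h]$,

$$g\Bigl(\frac{\gamma}{2}h\Bigr) \ge 2g(h).$$

Then

$$\frac{\gamma}{2} g^{-1}(u) \le g^{-1}(2u) \tag{29}$$

for all $u \ge g(\bar h)$. Moreover, there exists a finite constant $\kappa_1$ depending only on $g$ such that for all $k, l \in \mathbb{Z}_+$ with $k \le l$ one has

$$g^{-1}(2^k) \le \kappa_1 \Bigl(\frac{2}{\gamma}\Bigr)^{l-k} g^{-1}(2^l). \tag{30}$$

If $\nu(B(0,h)^c) \le g(h)$ for all $h > 0$, and $\nu$ has a second moment, then

$$\int_{B(0,h)^c} |x|\,\nu(dx) \le \kappa_2\, h\, g(h),$$

where $\kappa_2$ is a constant that depends only on $g$ and $\int |x|^2\,\nu(dx)$.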
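
For the power law used in the proof of Corollary 1.2, the hypotheses and conclusions (29) and (30) of this lemma hold with equality, which makes them easy to check numerically. The sketch below assumes $g(h) = \kappa h^{-\beta}$ on the whole half-line and $\gamma = 2^{1-1/\beta}$; the helper name and parameter values are illustrative.

```python
def make_power_law(kappa, beta):
    """Return g(h) = kappa * h**(-beta), its inverse, and gamma = 2**(1 - 1/beta)."""
    def g(h):
        return kappa * h ** (-beta)

    def g_inv(u):
        return (kappa / u) ** (1.0 / beta)

    gamma = 2.0 ** (1.0 - 1.0 / beta)
    return g, g_inv, gamma


g, g_inv, gamma = make_power_law(kappa=3.0, beta=1.5)

h, u, k, l = 0.2, 7.0, 2, 5
doubling_gap = g(0.5 * gamma * h) - 2.0 * g(h)          # hypothesis of the lemma
gap_29 = 0.5 * gamma * g_inv(u) - g_inv(2.0 * u)          # (29), with equality here
gap_30 = g_inv(2.0 ** k) - (2.0 / gamma) ** (l - k) * g_inv(2.0 ** l)  # (30), kappa_1 = 1
```

All three gaps vanish up to floating point error, so for this $g$ the constant $\kappa_1$ in (30) can be taken equal to one.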

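Proof. First, note that property (2) is equivalent to

$$\frac{\gamma}{2} g^{-1}(u) \le g^{-1}(2u)$$

for all sufficiently large $u > 0$. This implies that there exists a finite constant $\kappa_1$ depending only on $g$ such that for all $k, l \in \mathbb{Z}_+$ with $k \le l$ one has

$$g^{-1}(2^k) \le \kappa_1 \Bigl(\frac{2}{\gamma}\Bigr)^{l-k} g^{-1}(2^l).$$

For general $h > 0$ one has

$$\int_{B(0,h)^c} |x|\,\nu(dx) \le \int_{B(0,h)^c \cap B(0,\bar h)} |x|\,\nu(dx) + \frac{1}{\bar h} \int |x|^2\,\nu(dx).$$

Moreover,

$$\int_{B(0,h)^c \cap B(0,\bar h)} |x|\,\nu(dx) \le \sum_{n=0}^\infty \nu\Bigl(B\bigl(0, h(\tfrac{2}{\gamma})^n\bigr)^c\Bigr)\, h\Bigl(\frac{2}{\gamma}\Bigr)^{n+1} 1_{\{h(2/\gamma)^n \le \bar h\}} \le \sum_{n=0}^\infty \underbrace{g\Bigl(h\bigl(\tfrac{2}{\gamma}\bigr)^n\Bigr)}_{\le 2^{-n} g(h)} h\Bigl(\frac{2}{\gamma}\Bigr)^{n+1} 1_{\{h(2/\gamma)^n \le \bar h\}} \le 2hg(h) \sum_{n=0}^\infty \gamma^{-(n+1)}.$$

□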



Lemma A.4. Let $n \in \mathbb{N}$ and $(\mathcal{G}_j)_{j=0,1,\dots,n}$ denote a filtration. Moreover, let, for $j = 0, \dots, n-1$, $U_j$ and $V_j$ denote nonnegative random variables such that $U_j$ is $\mathcal{G}_j$-measurable, and $V_j$ is $\mathcal{G}_{j+1}$-measurable and independent of $\mathcal{G}_j$. Then one has

$$\mathbb{E}\Bigl[\max_{j=0,\dots,n-1} U_j V_j\Bigr] \le \mathbb{E}\Bigl[\max_{j=0,\dots,n-1} U_j\Bigr] \cdot \mathbb{E}\Bigl[\max_{j=0,\dots,n-1} V_j\Bigr].$$

Proof. See [6]. □
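
The inequality of Lemma A.4 can be sanity-checked on a small finite probability space by exact enumeration. The sketch below uses $n$ fair coin flips $\omega_1, \dots, \omega_n$ with $\mathcal{G}_j$ generated by the first $j$ flips; the particular choices $U_j = 1 + \omega_1 + \dots + \omega_j$ and $V_j = 1 + 2\omega_{j+1}$ are hypothetical and serve only as one admissible example.

```python
from fractions import Fraction
from itertools import product


def check_lemma_a4(n=3):
    """Return (E[max_j U_j V_j], E[max_j U_j] * E[max_j V_j]) for the coin-flip example."""
    outcomes = list(product((0, 1), repeat=n))
    p = Fraction(1, len(outcomes))
    lhs = e_max_u = e_max_v = Fraction(0)
    for omega in outcomes:
        # U_j depends only on the first j flips, V_j only on flip number j + 1.
        u = [1 + sum(omega[:j]) for j in range(n)]
        v = [1 + 2 * omega[j] for j in range(n)]
        lhs += p * max(uj * vj for uj, vj in zip(u, v))
        e_max_u += p * max(u)
        e_max_v += p * max(v)
    return lhs, e_max_u * e_max_v


lhs, rhs = check_lemma_a4()
```

For $n = 3$ the left-hand side equals $37/8$ and the right-hand side $11/2$, so the inequality holds with room to spare in this example; a single example is of course no substitute for the proof in [6].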